Abstract

We study the optimal transport problem in sub-Riemannian manifolds where
the cost function is given by the square of the sub-Riemannian distance. Under
appropriate assumptions, we generalize Brenier-McCann's Theorem, proving existence
and uniqueness of the optimal transport map. We show the absolute continuity
property of Wasserstein geodesics, and we address the regularity issue of the
optimal map. In particular, we are able to show its approximate differentiability
a.e. in the Heisenberg group (and, under some weak assumptions on the measures,
its differentiability a.e.), which allows us to write a weak form of the Monge-Ampère
equation.

1 Introduction 

The optimal transport problem can be stated as follows: given two probability measures
$\mu$ and $\nu$, defined on measurable spaces $X$ and $Y$ respectively, find a measurable map
$T : X \to Y$ with

\[ T_\#\mu = \nu \quad \text{(i.e. } \nu(A) = \mu(T^{-1}(A)) \text{ for all } A \subset Y \text{ measurable)}, \]

and in such a way that $T$ minimizes the transportation cost. This last condition means

\[ \int_X c(x,T(x))\,d\mu(x) = \min_{S_\#\mu=\nu} \left\{ \int_X c(x,S(x))\,d\mu(x) \right\}, \]

where $c : X \times Y \to \mathbb{R}$ is some given cost function, and the minimum is taken over all
measurable maps $S : X \to Y$ with $S_\#\mu = \nu$. When the transport condition $T_\#\mu = \nu$ is
satisfied, we say that $T$ is a transport map, and if $T$ also minimizes the cost we call it an
optimal transport map. Up to now the optimal transport problem has been intensively
studied in a Euclidean or a Riemannian setting by many authors, and it turns out
that the particular choice $c(x,y) = d^2(x,y)$ (here $d$ denotes a Riemannian distance)
is suitable for studying some partial differential equations (like the semi-geostrophic
or porous medium equations), for studying functional inequalities (like Sobolev and
Poincaré-type inequalities) and for applications in geometry (for example, in the study



* Université de Nice-Sophia Antipolis, Labo. J.-A. Dieudonné, UMR 6621, Parc Valrose, 06108 Nice
Cedex 02, France (figalli@unice.fr)

† Centre de Mathématiques Laurent Schwartz, École Polytechnique, 91128 Palaiseau, France
(figalli@math.polytechnique.fr)

‡ Université de Nice-Sophia Antipolis, Labo. J.-A. Dieudonné, UMR 6621, Parc Valrose, 06108 Nice
Cedex 02, France (rifford@unice.fr)






of lower bounds on the Ricci curvature of the manifolds). We refer to the books [7, 37, 38]
for an excellent presentation.

After the existence and uniqueness results of Brenier for the Euclidean case [12]
and McCann for the Riemannian case [28], people tried to extend the theory to a
sub-Riemannian setting. In [8] Ambrosio and Rigot studied the optimal transport problem
in the Heisenberg group, and recently Agrachev and Lee were able to extend their
result to more general situations such as sub-Riemannian structures corresponding to
2-generating distributions [3].

Two key properties of the optimal transport map turn out to be useful for many
applications: the first one is the fact that the transport map is differentiable a.e. (this,
for example, allows one to write the Jacobian of the transport map a.e.), and the second
one is that, if $\mu$ and $\nu$ are absolutely continuous with respect to the volume measure,
so are all the measures belonging to the (unique) Wasserstein geodesic between them.
Both these properties are true in the Euclidean case (see for example [7]) or on compact
Riemannian manifolds (see [19] [1U]). If the manifold is noncompact, the second
property still remains true (see [21, Section 5]), while the first one holds in a weaker form.
Indeed, although one cannot hope for its differentiability in the non-compact case, as
proved in [23, Section 3] (see also [7]) the transport map is approximately differentiable
a.e., and this turns out to be enough for extending many results from the compact to
the non-compact case. Up to now, the only available results in these directions in a
sub-Riemannian setting were proved in [24], where the authors show that the absolute
continuity property along Wasserstein geodesics holds in the Heisenberg group.

The aim of this paper is twofold: on the one hand, we prove new existence and
uniqueness results for the optimal transport map on sub-Riemannian manifolds. In
particular, we show that the structure of the optimal transport map is more or less the
same as in the Riemannian case (see [28]). On the other hand, in a still large class of
cases, we prove that the transport map is (approximately) differentiable almost
everywhere, and that the absolute continuity property along Wasserstein geodesics holds.
This settles several open problems raised in [8, Section 7]: first of all, regarding
problem [8, Section 7 (a)], we are able to extend the results of Ambrosio and Rigot [8] and
of Agrachev and Lee [3] to a large class of sub-Riemannian manifolds, not necessarily
two-generating. Concerning question [8, Section 7 (b)], we can prove a regularity result
on optimal transport maps, showing that under appropriate assumptions (including
the Heisenberg group) they are approximately differentiable a.e. Moreover, under some
weak assumptions on the measures, the transport map is shown to be truly
differentiable a.e. (see Theorem 3.7 and Remark 3.8). This allows one, for the first time in this
setting, to apply the area formula, and to write a weak formulation of the Monge-Ampère
equation (see Remark 3.9). Finally, Theorem 3.5 answers problem [8, Section 7 (c)]
not only in the Heisenberg group (which was already solved in [24]) but also in more
general cases.

The structure of the paper is the following: 

In Section 2 we introduce some concepts of sub-Riemannian geometry and optimal
transport appearing in the statements of the results.

In Section 3 we present our results on the mass transportation problem in
sub-Riemannian geometry: existence and uniqueness theorems on optimal transport maps
(Theorems 3.2 and 3.3), absolute continuity property along Wasserstein geodesics
(Theorem 3.5), and finally regularity of the optimal transport map and its consequences
(Theorem 3.7 and Remarks 3.8, 3.9). For the sake of simplicity, all the measures appearing
in these results are assumed to have compact supports. In the last paragraph of Section
3 we discuss the possible extensions of our results to the non-compact case.

In Section 4 we give a list of sub-Riemannian structures to which our different
results may be applied. These cases include fat distributions, two-generating
distributions, generic distributions of rank $\geq 3$, nonholonomic distributions on three-dimensional
manifolds, medium-fat distributions, codimension-one nonholonomic distributions, and
rank-two distributions in four-dimensional manifolds.

Since the proofs of the theorems require many tools and results from sub-Riemannian
geometry, we recall in Section 5 basic facts in sub-Riemannian geometry, such as the
characterization of singular horizontal paths, the description of sub-Riemannian
minimizing geodesics, and the properties of the sub-Riemannian exponential mapping. Then
we present some results concerning the regularity of the sub-Riemannian distance
function and its cut locus. These latter results are the key tools in the proofs of our
transport theorems.

In Section 6, taking advantage of the regularity properties obtained in the previous
section, we provide all the proofs of the results stated in Section 3.

Finally, in Appendix A, we recall some classical facts on semiconcave functions, 
while in Appendix B we prove auxiliary results needed in Section HI 

2 Preliminaries 

2.1 Sub-Riemannian manifolds 

A sub-Riemannian manifold is given by a triple $(M, \Delta, g)$ where $M$ denotes a smooth
connected manifold of dimension $n$, $\Delta$ is a smooth nonholonomic distribution of rank
$m < n$ on $M$, and $g$ is a Riemannian metric on $M$.¹ We recall that a smooth distribution
of rank $m$ on $M$ is a rank $m$ subbundle of $TM$. This means that, for every $x \in M$,
there exist a neighborhood $V_x$ of $x$ in $M$, and an $m$-tuple $(f_1^x, \dots, f_m^x)$ of smooth vector
fields on $V_x$, linearly independent on $V_x$, such that

\[ \Delta(z) = \mathrm{Span}\{f_1^x(z), \dots, f_m^x(z)\} \qquad \forall z \in V_x. \]

One says that the $m$-tuple of vector fields $(f_1^x, \dots, f_m^x)$ represents locally the distribution
$\Delta$. The distribution $\Delta$ is said to be nonholonomic (also called totally nonholonomic,
e.g. in [1]) if, for every $x \in M$, there is an $m$-tuple $(f_1^x, \dots, f_m^x)$ of smooth vector fields
on $V_x$ which represents locally the distribution and such that

\[ \mathrm{Lie}\{f_1^x, \dots, f_m^x\}(z) = T_zM \qquad \forall z \in V_x, \]

1 Note that in general the definition of a sub-Riemannian structure only involves a Riemannian 
metric on the distribution. However, since in the sequel we need a global Riemannian distance on the 
ambient manifold and we need to use Hessians, we prefer to work with a metric defined globally on 
TM. 
that is, such that the Lie algebra² spanned by $f_1^x, \dots, f_m^x$ is equal to the whole tangent
space $T_zM$ at every point $z \in V_x$. This Lie algebra property is often called Hörmander's
condition.

A curve $\gamma : [0,1] \to M$ is called a horizontal path with respect to $\Delta$ if it belongs to
$W^{1,2}([0,1], M)$ and satisfies

\[ \dot\gamma(t) \in \Delta(\gamma(t)) \qquad \text{for a.e. } t \in [0,1]. \]

According to the classical Chow-Rashevsky Theorem (see [9, 17, 30, 32, 33]), since the
distribution is nonholonomic on $M$, any two points of $M$ can be joined by a horizontal
path. That is, for every $x, y \in M$ there exists a horizontal path $\gamma : [0,1] \to M$ such
that $\gamma(0) = x$ and $\gamma(1) = y$. For $x \in M$, let $\Omega_\Delta(x)$ denote the set of horizontal paths
$\gamma : [0,1] \to M$ such that $\gamma(0) = x$. The set $\Omega_\Delta(x)$, endowed with the $W^{1,2}$-topology,
inherits a Hilbert manifold structure (see [30]). The end-point mapping from $x$ is
defined by

\[ E_x : \Omega_\Delta(x) \to M, \qquad \gamma \mapsto \gamma(1). \]

It is a smooth mapping. A path $\gamma$ is said to be singular if it is horizontal and if it is
a critical point for the end-point mapping $E_x$, that is, if the differential of $E_x$ at $\gamma$ is
singular (i.e. not onto). A horizontal path which is not singular is called nonsingular
or regular. Note that the regularity or singularity property of a given horizontal path
depends only on the distribution, not on the metric $g$.

The length of a path $\gamma \in \Omega_\Delta(x)$ is defined by

\[ \mathrm{length}_g(\gamma) := \int_0^1 \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))}\,dt. \qquad (2.1) \]

The sub-Riemannian distance $d_{SR}(x,y)$ (also called Carnot-Carathéodory distance)
between two points $x, y$ of $M$ is the infimum over the lengths of the horizontal paths
joining $x$ and $y$. Since the distribution is nonholonomic on $M$, according to the
Chow-Rashevsky Theorem (see [9, 17, 30, 33]) the sub-Riemannian distance is finite and
continuous³ on $M \times M$. Moreover, if the manifold $M$ is a complete metric space⁴ for
the sub-Riemannian distance $d_{SR}$, then, since $M$ is connected, for every pair $x, y$ of
points of $M$ there exists a horizontal path $\gamma$ joining $x$ to $y$ such that

\[ d_{SR}(x,y) = \mathrm{length}_g(\gamma). \]

2 We recall that, for any family $\mathcal{F}$ of smooth vector fields on $M$, the Lie algebra of vector fields
generated by $\mathcal{F}$, denoted by $\mathrm{Lie}(\mathcal{F})$, is the smallest vector space $S$ containing $\mathcal{F}$ and satisfying

\[ [X, Y] \in S \qquad \forall X \in \mathcal{F},\ \forall Y \in S, \]

where $[X, Y]$ is the Lie bracket of $X$ and $Y$.

3 In fact, thanks to the so-called Mitchell's ball-box Theorem (see [30]), the sub-Riemannian distance
can be shown to be locally Hölder continuous on $M \times M$.

4 Note that, since the distribution $\Delta$ is nonholonomic on $M$, the topology defined by the
sub-Riemannian distance $d_{SR}$ coincides with the original topology of $M$ (see [5] [30]). Moreover, it can be
shown that, if the Riemannian manifold $(M, g)$ is complete, then for any nonholonomic distribution $\Delta$
on $M$ the sub-Riemannian manifold $(M, \Delta, g)$ equipped with its sub-Riemannian distance is complete.
Such a horizontal path is called a sub-Riemannian minimizing geodesic between $x$ and $y$.

Assuming that $(M, d_{SR})$ is complete, denote by $T^*M$ the cotangent bundle of $M$,
by $\omega$ the canonical symplectic form on $T^*M$, and by $\pi : T^*M \to M$ the canonical
projection. The sub-Riemannian Hamiltonian $H : T^*M \to \mathbb{R}$ which is canonically
associated with the sub-Riemannian structure is defined as follows: for every $x \in M$,
the restriction of $H$ to the fiber $T_x^*M$ is given by the nonnegative quadratic form

\[ p \longmapsto \frac12 \max\left\{ \frac{p(v)^2}{g_x(v,v)} \ \Big|\ v \in \Delta(x) \setminus \{0\} \right\}. \qquad (2.2) \]

Let $\vec{H}$ denote the Hamiltonian vector field on $T^*M$ associated to $H$, that is,
$\iota_{\vec{H}}\,\omega = -dH$. A normal extremal is an integral curve of $\vec{H}$ defined on $[0,1]$, i.e. a curve
$\psi(\cdot) : [0,1] \to T^*M$ satisfying

\[ \dot\psi(t) = \vec{H}(\psi(t)) \qquad \forall t \in [0,1]. \]

Note that the projection of a normal extremal is a horizontal path with respect to $\Delta$.
For every $x \in M$, the exponential mapping with respect to $x$ is defined by

\[ \exp_x : T_x^*M \to M, \qquad p \mapsto \pi(\psi(1)), \]

where $\psi$ is the normal extremal such that $\psi(0) = (x,p)$ in local coordinates. We stress
that, unlike the Riemannian setting, the sub-Riemannian exponential mapping with
respect to $x$ is defined on the cotangent space at $x$.
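As a hedged illustration of the Hamiltonian (2.2), read with the usual normalization $H(x,p) = \frac12 \max\{p(v)^2/g_x(v,v)\}$: if one assumes (this is an assumption made for the sketch, not a statement from the text) that $f_1, \dots, f_m$ is a local frame of $\Delta$ which is orthonormal for $g$, the maximum can be evaluated with the Cauchy-Schwarz inequality and yields the familiar sum-of-squares expression.

```latex
% Assumption: f_1, ..., f_m is a g-orthonormal frame of \Delta near x.
% Write v = \sum_i a_i f_i(x), so that g_x(v,v) = \sum_i a_i^2.
\[
  p(v)^2 = \Big( \sum_{i=1}^m a_i \, p(f_i(x)) \Big)^2
         \le \Big( \sum_{i=1}^m a_i^2 \Big) \Big( \sum_{i=1}^m p(f_i(x))^2 \Big),
\]
% with equality when a_i = p(f_i(x)) for every i. Hence
\[
  H(x,p) = \frac12 \max_{v \in \Delta(x)\setminus\{0\}} \frac{p(v)^2}{g_x(v,v)}
         = \frac12 \sum_{i=1}^m p(f_i(x))^2 .
\]
```

In this form it is apparent that $H(x,\cdot)$ is a nonnegative quadratic form which vanishes exactly on the covectors annihilating $\Delta(x)$.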

Remark: from now on, all sub-Riemannian manifolds appearing in the paper are 
assumed to be complete with respect to the sub-Riemannian distance. 



2.2 Preliminaries in optimal transport theory 

As we already said in the introduction, given a cost function $c : X \times Y \to \mathbb{R}$,
we are looking for a transport map $T : X \to Y$ which minimizes the transportation
cost $\int_X c(x,T(x))\,d\mu(x)$. The constraint $T_\#\mu = \nu$ being highly non-linear, the optimal
transport problem is quite difficult from the viewpoint of the calculus of variations. The
major advance on this problem was due to Kantorovich, who proposed in [25, 26]
a notion of weak solution of the optimal transport problem. He suggested looking
for plans instead of transport maps, that is, probability measures $\gamma$ on $X \times Y$ whose
marginals are $\mu$ and $\nu$, i.e.

\[ (\pi_X)_\#\gamma = \mu \quad \text{and} \quad (\pi_Y)_\#\gamma = \nu, \]

where $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ are the canonical projections. Denoting
by $\Pi(\mu,\nu)$ the set of plans, the new minimization problem becomes the following:

\[ C(\mu,\nu) = \min_{\gamma \in \Pi(\mu,\nu)} \left\{ \int_{X \times Y} c(x,y)\,d\gamma(x,y) \right\}. \qquad (2.3) \]
If $\gamma$ is a minimizer for the Kantorovich formulation, we say that it is an optimal plan.
Due to the linearity of the constraint $\gamma \in \Pi(\mu,\nu)$, it is simple, using weak topologies,
to prove existence of solutions to (2.3): this happens for instance whenever $X$ and
$Y$ are Polish spaces, and $c$ is lower semicontinuous and bounded from below (see for
instance [37, 38]). The connection between the formulation of Kantorovich and that of
Monge can be seen by noticing that any transport map $T$ induces the plan defined by
$(\mathrm{Id} \times T)_\#\mu$, which is concentrated on the graph of $T$. Hence the problem of showing
existence of optimal transport maps reduces to proving that an optimal transport
plan is concentrated on a graph. Moreover, if one can show that any optimal plan is
concentrated on a graph, then, since $\frac{\gamma_1 + \gamma_2}{2}$ is optimal if so are $\gamma_1$ and $\gamma_2$, uniqueness of the
transport map easily follows.
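The uniqueness argument just mentioned can be spelled out as follows. This is only a sketch of the standard reasoning, assuming that every optimal plan is known to be concentrated on a graph; the names $T_1$, $T_2$ and $T$ are introduced here for illustration.

```latex
% Sketch: suppose T_1 and T_2 are optimal maps, with induced plans
% \gamma_i = (\mathrm{Id} \times T_i)_\# \mu, i = 1, 2.
\[
  \gamma := \tfrac12 (\gamma_1 + \gamma_2) \in \Pi(\mu,\nu),
  \qquad
  \int c \, d\gamma
  = \tfrac12 \int c \, d\gamma_1 + \tfrac12 \int c \, d\gamma_2
  = C(\mu,\nu).
\]
% So \gamma is optimal, hence concentrated on the graph of some map T.
% Since \gamma_1 \le 2\gamma and \gamma_2 \le 2\gamma, both plans are
% concentrated on that same graph, which forces
\[
  T_1(x) = T(x) = T_2(x) \qquad \text{for } \mu\text{-a.e. } x.
\]
```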

Definition 2.1. A function $\phi : X \to \mathbb{R}$ is said to be c-concave if there exists a function
$\phi^c : Y \to \mathbb{R} \cup \{-\infty\}$, with $\phi^c \not\equiv -\infty$, such that

\[ \phi(x) = \inf_{y \in Y} \{ c(x,y) - \phi^c(y) \}. \]

If $\phi$ is c-concave, we define the c-superdifferential of $\phi$ at $x$ as

\[ \partial^c\phi(x) := \{ y \in Y \mid \phi(x) + \phi^c(y) = c(x,y) \}. \]

Moreover we define the c-superdifferential of $\phi$ as

\[ \partial^c\phi := \{ (x,y) \in X \times Y \mid y \in \partial^c\phi(x) \}. \]
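To get a feeling for c-concavity, it may help to look at the model Euclidean case. The following is only an illustration under the assumption $X = Y = \mathbb{R}^n$ and $c(x,y) = |x-y|^2$, which is not the setting of this paper: expanding the square shows that c-concavity forces $x \mapsto |x|^2 - \phi(x)$ to be convex.

```latex
% Illustration only: X = Y = \mathbb{R}^n and c(x,y) = |x-y|^2 (an assumption).
\[
  \phi(x) = \inf_{y} \big\{ |x-y|^2 - \phi^c(y) \big\}
  \iff
  |x|^2 - \phi(x) = \sup_{y} \big\{ 2\, x \cdot y - |y|^2 + \phi^c(y) \big\}.
\]
% The right-hand side is a supremum of affine functions of x, hence a convex
% function. So if \phi is c-concave for this cost, then
% x \mapsto |x|^2 - \phi(x) is convex.
```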

As we already said in the introduction, we are interested in studying the optimal
transport problem on $M \times M$ ($M$ being a complete sub-Riemannian manifold) with
the cost function given by $c(x,y) = d_{SR}^2(x,y)$.

Definition 2.2. Denote by $P_c(M)$ the set of compactly supported probability measures
on $M$ and by $P_2(M)$ the set of Borel probability measures on $M$ with finite second-order
moment, that is, the set of $\mu$ satisfying

\[ \int_M d_{SR}^2(x, x_0)\,d\mu(x) < +\infty \qquad \text{for some } x_0 \in M. \]

Furthermore, we denote by $P_c^{ac}(M)$ (resp. $P_2^{ac}(M)$) the subset of $P_c(M)$ (resp. $P_2(M)$)
that consists of the probability measures on $M$ which are absolutely continuous with
respect to the volume measure.

Obviously $P_c(M) \subset P_2(M)$. Moreover we remark that, by the triangle inequality
for $d_{SR}$, the definition of $P_2(M)$ does not depend on $x_0$. The space $P_2(M)$ can be
endowed with the so-called Wasserstein distance $W_2$:

\[ W_2^2(\mu,\nu) := \min_{\gamma \in \Pi(\mu,\nu)} \left\{ \int_{M \times M} d_{SR}^2(x,y)\,d\gamma(x,y) \right\} \]

(note that $W_2^2$ is nothing else than the infimum in the Kantorovich problem). As
$W_2$ defines a finite metric on $P_2(M)$, one can speak about geodesics in the metric space
$(P_2(M), W_2)$. This space turns out, indeed, to be a length space (see for example [7, 37, 38]).
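A simple sanity check of the definition, given here only as an illustration: for two Dirac masses the set of plans reduces to a single element, so the Wasserstein distance can be computed by hand.

```latex
% Illustration: \mu = \delta_x and \nu = \delta_y with x, y \in M.
% The only measure on M \times M with these marginals is \delta_{(x,y)}, so
\[
  \Pi(\delta_x, \delta_y) = \{ \delta_{(x,y)} \},
  \qquad
  W_2^2(\delta_x, \delta_y) = \int_{M \times M} d_{SR}^2 \, d\delta_{(x,y)}
  = d_{SR}^2(x,y).
\]
% Hence x \mapsto \delta_x embeds (M, d_{SR}) isometrically into (P_2(M), W_2).
```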

From now on, $\mathrm{supp}(\mu)$ and $\mathrm{supp}(\nu)$ will denote the supports of $\mu$ and $\nu$ respectively,
i.e. the smallest closed sets on which $\mu$ and $\nu$ are respectively concentrated.
The following result is well-known (see for instance [38, Chapter 5]): 

Theorem 2.3. Let us assume that $\mu, \nu \in P_2(M)$. Then there exists a c-concave
function $\phi$ such that the following holds: a transport plan $\gamma \in \Pi(\mu,\nu)$ is optimal if and only
if $\gamma(\partial^c\phi) = 1$ (that is, $\gamma$ is concentrated on the c-superdifferential of $\phi$). Moreover one
can assume that the following holds:

\[ \phi(x) = \inf_{y \in \mathrm{supp}(\nu)} \{ d_{SR}^2(x,y) - \phi^c(y) \} \qquad \forall x \in M, \]

\[ \phi^c(y) = \inf_{x \in \mathrm{supp}(\mu)} \{ d_{SR}^2(x,y) - \phi(x) \} \qquad \forall y \in M. \]

In addition, if $\mu, \nu \in P_c(M)$, then both infima are indeed minima (so that
$\partial^c\phi(x) \cap \mathrm{supp}(\nu) \neq \emptyset$ for $\mu$-a.e. $x$), and the functions $\phi$ and $\phi^c$ are continuous.

By the above theorem we see that, in order to prove existence and uniqueness of
optimal transport maps, it suffices to prove that there exist two Borel sets $Z_1, Z_2 \subset M$,
with $\mu(Z_1) = \nu(Z_2) = 1$, such that $\partial^c\phi$ is a graph inside $Z_1 \times Z_2$ (or equivalently that
$\partial^c\phi(x) \cap Z_2$ is a singleton for all $x \in Z_1$).



3 Statement of the results 

3.1 Sub-Riemannian versions of Brenier-McCann's Theorems 

The main difficulty appearing in the sub-Riemannian setting (unlike the Riemannian 
situation) is that, in general, the squared distance function is not locally Lipschitz 
on the diagonal. This gives rise to difficulties which make the proofs more technical 
than in the Riemannian case (and some new ideas are also needed). In order to avoid 
technicalities which would obscure the main ideas of the proof, we will state our results 
under some simplifying assumptions on the measures, and in Paragraph 3.4 we will
explain how to remove them. 

Before stating our first existence and uniqueness result, we introduce the following 
definition: 

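Definition 3.1. Given a c-concave function $\phi : M \to \mathbb{R}$, we define the "moving" set
$\mathcal{M}^\phi$ and the "static" set $\mathcal{S}^\phi$ as

\[ \mathcal{M}^\phi := \{ x \in M \mid x \notin \partial^c\phi(x) \}, \]
\[ \mathcal{S}^\phi := M \setminus \mathcal{M}^\phi = \{ x \in M \mid x \in \partial^c\phi(x) \}. \]

We will also denote by $\pi_1 : M \times M \to M$ and $\pi_2 : M \times M \to M$ the canonical
projections on the first and on the second factor, respectively. In the sequel, $D$ denotes
the diagonal in $M \times M$, that is

\[ D := \{ (x,y) \in M \times M \mid x = y \}. \]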
Furthermore, we refer the reader to Appendix A for the definition of a locally
semiconcave function.

Theorem 3.2 (Optimal transport map for absolutely continuous measures).

Let $\mu \in P_c^{ac}(M)$, $\nu \in P_c(M)$. Assume that there exists an open set $\Omega \subset M \times M$ such
that $\mathrm{supp}(\mu \times \nu) \subset \Omega$, and $d_{SR}^2$ is locally semiconcave (resp. locally Lipschitz) on $\Omega \setminus D$.
Let $\phi$ be the c-concave function provided by Theorem 2.3. Then:

(i) $\mathcal{M}^\phi$ is open, and $\phi$ is locally semiconcave (resp. locally Lipschitz) in a
neighborhood of $\mathcal{M}^\phi \cap \mathrm{supp}(\mu)$. In particular $\phi$ is differentiable $\mu$-a.e. in $\mathcal{M}^\phi$.

(ii) For $\mu$-a.e. $x \in \mathcal{S}^\phi$, $\partial^c\phi(x) = \{x\}$.

In particular, there exists a unique optimal transport map defined $\mu$-a.e. by⁵

\[ T(x) := \begin{cases} \exp_x(-\tfrac12\, d\phi(x)) & \text{if } x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu), \\ x & \text{if } x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu), \end{cases} \]

and for $\mu$-a.e. $x$ there exists a unique minimizing geodesic between $x$ and $T(x)$.

The two main issues in the proof of the above theorem are the regularity of the
c-concave function $\phi$ provided by Theorem 2.3 and the existence and uniqueness of
minimizing projections of normal extremals between almost all pairs of points in $\partial^c\phi$.
Roughly speaking, the regularity properties of $\phi$ are consequences of regularity
assumptions made on the cost function, while the second issue is tackled (as it was already done
by Agrachev and Lee in [3]) by transforming a problem with end-point constraint into
a problem with free end-point (see Proposition 5.5). Furthermore, as can be seen from
the proof (given in Section 6), assertion (ii) in Theorem 3.2 always holds without any
assumption on the sub-Riemannian distance. That is, for any optimal transport
problem on a complete sub-Riemannian manifold between two measures $\mu \in P_c^{ac}(M)$ and
$\nu \in P_c(M)$, we always have

\[ \partial^c\phi(x) = \{x\} \qquad \text{for } \mu\text{-a.e. } x \in \mathcal{S}^\phi, \]

where $\phi$ is the c-concave function provided by Theorem 2.3. Such a result is a
consequence of a Pansu-Rademacher Theorem which was already used by Ambrosio and
Rigot in [8].
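The factor $\tfrac12$ in the formula $T(x) = \exp_x(-\tfrac12\, d\phi(x))$ can be understood on the simplest possible model. The computation below is only an illustration, assuming $M = \mathbb{R}^n$ with the Euclidean distance (so that the exponential map is $p \mapsto x + p$ after the usual identifications); it is not part of the sub-Riemannian argument.

```latex
% Illustration only: M = \mathbb{R}^n with the Euclidean distance (an assumption).
% If y \in \partial^c\phi(x) and \phi is differentiable at x, then
% z \mapsto |z - y|^2 - \phi(z) attains its minimum at z = x, so
\[
  2\,(x - y) - \nabla\phi(x) = 0
  \quad\Longrightarrow\quad
  y = x - \tfrac12 \nabla\phi(x).
\]
% With the cost \tfrac12 |x-y|^2 instead, the same computation gives
% y = x - \nabla\phi(x), with no factor 1/2.
```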

Theorem 3.2 above can be refined if the sub-Riemannian distance is assumed to
be locally Lipschitz on the diagonal. In that way, we obtain the sub-Riemannian
version of McCann's Theorem on Riemannian manifolds (see [25]), improving the result
of Agrachev and Lee (see [3]).

Theorem 3.3 (Optimal transport map for more general measures). Let $\mu, \nu \in
P_c(M)$, and suppose that $\mu$ gives no measure to countably $(n-1)$-rectifiable sets. Assume
that there exists an open set $\Omega \subset M \times M$ such that $\mathrm{supp}(\mu \times \nu) \subset \Omega$, and $d_{SR}^2$ is locally
semiconcave on $\Omega \setminus D$. Suppose further that $d_{SR}^2$ is locally Lipschitz on $\Omega$, and let $\phi$ be
the c-concave function provided by Theorem 2.3. Then:

5 The factor $\tfrac12$ appearing in front of $d\phi(x)$ is due to the fact that we are considering the cost function
$d_{SR}^2(x,y)$ instead of the (equivalent) cost $\tfrac12 d_{SR}^2(x,y)$.
(i) $\mathcal{M}^\phi$ is open, and $\phi$ is locally semiconcave in a neighborhood of $\mathcal{M}^\phi \cap \mathrm{supp}(\mu)$. In
particular $\phi$ is differentiable $\mu$-a.e. in $\mathcal{M}^\phi$.

(ii) For $\mu$-a.e. $x \in \mathcal{S}^\phi$, $\partial^c\phi(x) = \{x\}$.

In particular, there exists a unique optimal transport map defined $\mu$-a.e. by

\[ T(x) := \begin{cases} \exp_x(-\tfrac12\, d\phi(x)) & \text{if } x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu), \\ x & \text{if } x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu), \end{cases} \]

and for $\mu$-a.e. $x$ there exists a unique minimizing geodesic between $x$ and $T(x)$.

The regularity properties of the sub-Riemannian distance functions required in the
two results above are satisfied by many sub-Riemannian manifolds. In particular
Theorem 3.2 holds as soon as there is no singular sub-Riemannian minimizing geodesic
between two distinct points in $\Omega$. In Section 4 we provide a list of sub-Riemannian
manifolds which satisfy the assumptions of our different results.

3.2 Wasserstein geodesics

Thanks to Theorem 3.2 it is not difficult to deduce the uniqueness of the Wasserstein
geodesic between $\mu$ and $\nu$. Moreover the structure of the transport map allows one to prove,
as in the Riemannian case, that all the measures inside the geodesic are absolutely
continuous if $\mu$ is. This last property requires however that, if $(x,y) \in \Omega$, then all
geodesics from $x$ to $y$ do not "exit from $\Omega$":

Definition 3.4. Let $\Omega \subset M \times M$ be an open set. We say that $\Omega$ is totally geodesically
convex if for every $(x,y) \in \Omega$ and every geodesic $\gamma : [0,1] \to M$ from $x$ to $y$, one has

\[ (x, \gamma(t)),\ (\gamma(t), y) \in \Omega \qquad \forall t \in [0,1]. \]

Observe that, if $\Omega = U \times U$ with $U \subset M$, then the above definition reduces to saying
that $U$ is totally geodesically convex in the classical sense.

Theorem 3.5 (Absolute continuity of Wasserstein geodesics). Let $\mu \in P_c^{ac}(M)$,
$\nu \in P_c(M)$. Assume that there exists an open set $\Omega \subset M \times M$ such that $\mathrm{supp}(\mu \times \nu) \subset \Omega$,
and $d_{SR}^2$ is locally semiconcave on $\Omega \setminus D$. Let $\phi$ be the c-concave function provided by
Theorem 2.3. Then there exists a unique Wasserstein geodesic $(\mu_t)_{t \in [0,1]}$ joining $\mu = \mu_0$
to $\nu = \mu_1$, which is given by $\mu_t := (T_t)_\#\mu$ for $t \in [0,1]$, with

\[ T_t(x) := \begin{cases} \exp_x(-\tfrac{t}{2}\, d\phi(x)) & \text{if } x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu), \\ x & \text{if } x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu). \end{cases} \]

Moreover, if $\Omega$ is totally geodesically convex, then $\mu_t \in P_c^{ac}(M)$ for all $t \in [0,1)$.

3.3 Regularity of the transport map and the Monge-Ampère equation

The structure of the transport map provided by Theorem 3.2 also allows one to prove, in
certain cases, the approximate differentiability of the optimal transport map, and a
useful Jacobian identity. Let us first recall the notion of approximate differential:
Definition 3.6 (Approximate differential). We say that $f : M \to \mathbb{R}$ has an
approximate differential at $x \in M$ if there exists a function $h : M \to \mathbb{R}$ differentiable at $x$
such that the set $\{f = h\}$ has density 1 at $x$ with respect to the volume measure. In this
case, the approximate value of $f$ at $x$ is defined as $\tilde f(x) = h(x)$, and the approximate
differential of $f$ at $x$ is defined as $\tilde d f(x) = dh(x)$.

It is not difficult to show that the above definitions make sense. In fact, $h(x)$ and
$dh(x)$ do not depend on the choice of $h$, provided $x$ is a density point of the set $\{f = h\}$.

To write the formula for the Jacobian of $T$, we will need to use the notion of Hessian.
We recall that the Hessian of a function $f : M \to \mathbb{R}$ is defined as the covariant derivative
of $df$: $\mathrm{Hess} f(x) = \nabla df(x) : T_xM \times T_xM \to \mathbb{R}$. Observe that the notion of the Hessian
depends on the Riemannian metric on $TM$. Since the transport map depends
only on $d_{SR}$, which in turn depends only on the restriction of the metric to the distribution,
a priori it may seem strange that the Jacobian of $T$ is expressed in terms of Hessians.
However, as we will see below, the Jacobian of $T$ depends on the Hessian of the function
$z \mapsto \phi(z) - d_{SR}^2(z,T(x))$ computed at $z = x$. Since $z \mapsto \phi(z) - d_{SR}^2(z,T(x))$ attains a
maximum at $x$, the point $x$ is a critical point for the above function, and so its Hessian at $x$ is
indeed independent of the choice of the metric.
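The independence claim can be checked in local coordinates. The following is a standard computation, sketched here under the assumption that $f$ is twice differentiable at $x$; the Christoffel symbols $\Gamma^k_{ij}$ of the Levi-Civita connection of $g$ are notation introduced for this sketch.

```latex
% In local coordinates, with Christoffel symbols \Gamma^k_{ij} of the metric g:
\[
  \mathrm{Hess}\, f(x)_{ij}
  = \frac{\partial^2 f}{\partial x_i \partial x_j}(x)
  - \sum_{k} \Gamma^k_{ij}(x)\, \frac{\partial f}{\partial x_k}(x).
\]
% Only the second term depends on g. If df(x) = 0 it vanishes, so
\[
  \mathrm{Hess}\, f(x)_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(x)
  \qquad \text{whenever } df(x) = 0 .
\]
```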

The following result is the sub-Riemannian version of the properties of the transport
map in the Riemannian case. It was proved on compact manifolds in [19], and extended
to the noncompact case in [23]. The main difficulty in our case comes from the fact
that the structure of the sub-Riemannian cut-locus is different from that of the
Riemannian case, and so many complications arise when one tries to generalize the
Riemannian argument to our setting. Trying to extend the differentiability of the
transport map in great generality would need some new results on the sub-Riemannian
cut-locus which go beyond the scope of this paper (see the Open Problem in Paragraph
5.8). For this reason, we prefer to state the result under some simplifying assumptions,
which however hold in the important case of the Heisenberg group (see [30]), or for
example for the standard sub-Riemannian structure on the three-sphere (see [11]).

We refer the reader to Paragraph 5.8 for the definition of the global cut-locus
$\mathrm{Cut}_{SR}(M)$.

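Theorem 3.7 (Approximate differentiability and Jacobian identity). Let $\mu \in
P_c^{ac}(M)$, $\nu \in P_c(M)$. Assume that there exists a totally geodesically convex open set $\Omega \subset
M \times M$ such that $\mathrm{supp}(\mu \times \nu) \subset \Omega$, $d_{SR}^2$ is locally semiconcave on $\Omega \setminus D$, and for every
$(x,y) \in \mathrm{Cut}_{SR}(M) \cap (\Omega \setminus D)$ there are at least two distinct sub-Riemannian minimizing
geodesics joining $x$ to $y$. Let $\phi$ be the c-concave function provided by Theorem 2.3.
Then the optimal transport map is differentiable for $\mu$-a.e. $x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu)$, and it
is approximately differentiable $\mu$-a.e. Moreover

\[ Y(x) := d(\exp_x)_{-\frac12 d\phi(x)} \quad \text{and} \quad H(x) := \tfrac12\, \mathrm{Hess}\, d_{SR}^2(\cdot, T(x))\big|_{z=x} \]

exist for $\mu$-a.e. $x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu)$, and the approximate differential of $T$ is given by
the formula

\[ \tilde d T(x) = \begin{cases} Y(x)\big(H(x) - \tfrac12\, \mathrm{Hess}\,\phi(x)\big) & \text{if } x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu), \\ \mathrm{Id} & \text{if } x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu), \end{cases} \]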
where $\mathrm{Id} : T_xM \to T_xM$ denotes the identity map.

Finally, assuming both $\mu$ and $\nu$ absolutely continuous with respect to the volume
measure, and denoting by $f$ and $g$ their respective densities, the following Jacobian
identity holds:

\[ \big|\det(\tilde d T(x))\big| = \frac{f(x)}{g(T(x))} \neq 0 \qquad \mu\text{-a.e.} \qquad (3.1) \]

In particular, $f(x) = g(x)$ for $\mu$-a.e. $x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu)$.

Remark 3.8 (Differentiability a.e. of the transport map). If we assume that
$f \neq g$ $\mu$-a.e., then by the above theorem we deduce that $T(x) \neq x$ $\mu$-a.e. (or equivalently
$x \notin \partial^c\phi(x)$ $\mu$-a.e.). Therefore the optimal transport map is given by

\[ T(x) = \exp_x(-\tfrac12\, d\phi(x)) \qquad \mu\text{-a.e.}, \]

and in particular $T$ is differentiable (and not only approximately differentiable) $\mu$-a.e.

Remark 3.9 (The Monge-Ampère equation). Since the function $z \mapsto \phi(z) -
d_{SR}^2(z,T(x))$ attains a maximum at $x$ for $\mu$-a.e. $x$, it is not difficult to see that the
matrix $H(x) - \tfrac12\, \mathrm{Hess}\,\phi(x)$ (defined in Theorem 3.7) is nonnegative definite $\mu$-a.e. This
fact, together with (3.1), implies that the function $\phi$ satisfies the Monge-Ampère type
equation

\[ \det\big(H(x) - \tfrac12\, \mathrm{Hess}\,\phi(x)\big) = \frac{f(x)}{|\det(Y(x))|\, g(T(x))} \qquad \text{for } \mu\text{-a.e. } x \in \mathcal{M}^\phi. \]

In particular, thanks to Remark 3.8,

\[ \det\big(H(x) - \tfrac12\, \mathrm{Hess}\,\phi(x)\big) = \frac{f(x)}{|\det(Y(x))|\, g(T(x))} \qquad \mu\text{-a.e.}, \]

provided that $f \neq g$ $\mu$-a.e.
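As a consistency check, one can specialize the equation to the flat Riemannian model. The computation below is only an illustration, assuming $M = \mathbb{R}^n$, $\Delta = TM$ and the Euclidean metric (a degenerate case which is not sub-Riemannian in the strict sense); the function $\psi$ is notation introduced here.

```latex
% Illustration only: M = \mathbb{R}^n, \Delta = TM, Euclidean metric (an assumption).
% Then exp_x(p) = x + p and d_{SR}(x,y) = |x-y|, so
\[
  Y(x) = \mathrm{Id}, \qquad
  H(x) = \tfrac12\, \mathrm{Hess}_z |z - T(x)|^2 \big|_{z=x} = \mathrm{Id},
\]
% and the equation becomes
\[
  \det\!\big( \mathrm{Id} - \tfrac12\, \mathrm{Hess}\,\phi(x) \big)
  = \frac{f(x)}{g(T(x))},
  \qquad T(x) = x - \tfrac12 \nabla\phi(x).
\]
% Setting \psi(x) := \tfrac12 |x|^2 - \tfrac12 \phi(x), this reads
\[
  \det\!\big( \mathrm{Hess}\,\psi(x) \big) = \frac{f(x)}{g(\nabla\psi(x))},
\]
% the classical Monge-Ampere equation for the Brenier map T = \nabla\psi.
```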

3.4 The non-compact case 

Let us briefly show how to remove the compactness assumption on $\mu$ and $\nu$, and how to
relax the hypothesis $\mathrm{supp}(\mu \times \nu) \subset \Omega$. We assume $\mu, \nu \in P_2(M)$ (so that Theorem 2.3
applies), and that $\mu \times \nu(\Omega) = 1$. Take an increasing sequence of compact sets $K_\ell \subset \Omega$
such that $\bigcup_{\ell \in \mathbb{N}} K_\ell = \Omega$. We consider

\[ \psi_\ell(x) := \inf\{ d_{SR}^2(x,y) - \phi^c(y) \mid y \text{ s.t. } (x,y) \in K_\ell \}. \]

Since now $\phi^c$ is not a priori continuous (and so $\partial^c\psi_\ell$ is not necessarily closed), we first
define

\[ \phi_\ell^c(y) := \inf\{ d_{SR}^2(x,y) - \psi_\ell(x) \mid x \text{ s.t. } (x,y) \in K_\ell \}, \]

and then consider

\[ \phi_\ell(x) := \inf\{ d_{SR}^2(x,y) - \phi_\ell^c(y) \mid y \text{ s.t. } (x,y) \in K_\ell \}. \]

In this way the following properties hold (see for example the argument in the proof
of [HHl Proposition 5.8]):
- $\phi_\ell$ and $\phi_\ell^c$ are both continuous;

- $\psi_\ell(x) \geq \phi(x)$ for all $x \in M$;

- $\phi^c(y) \leq \phi_\ell^c(y)$ for all $y \in \pi_2(K_\ell)$;

- $\phi_\ell(x) = \psi_\ell(x)$ for all $x \in \pi_1(K_\ell)$.

This implies that $\partial^c\phi \cap K_\ell \subset \partial^c\phi_\ell$, and so

\[ \partial^c\phi \cap \Omega \subset \bigcup_{\ell \in \mathbb{N}} \partial^c\phi_\ell. \]

One can therefore prove (i) and (ii) in Theorem 3.2 with $\phi_\ell$ in place of $\phi$, and from
this and the hypothesis $\mu \times \nu(\Omega) = 1$ it is not difficult to deduce that $(\{x\} \times \partial^c\phi(x)) \cap \Omega$
is a singleton for $\mu$-a.e. $x$ (see the argument in the proof of Theorem 3.2). This proves
existence and uniqueness of the optimal transport map.

Although in this case we cannot hope for any semiconcavity result for $\phi$ (since, as
in the non-compact Riemannian case, $\phi$ is just a Borel function), the above argument
shows that the graph of the optimal transport map is contained in the union of the sets $\partial^c\phi_\ell$.
Hence, as in [21, Section 5], one can use the sets $\partial^c\phi_\ell$ to construct the (unique) Wasserstein
geodesic between $\mu$ and $\nu$, and in this way the absolute continuity of all measures
belonging to the geodesic follows as in the compactly supported case.

Finally, the fact that the graph of the optimal transport map is contained in
$\bigcup_{\ell \in \mathbb{N}} \partial^c\phi_\ell$ also allows one to prove the approximate differentiability of the transport map
and the Jacobian identity, provided that one replaces the Hessian of $\phi$ with the
approximate Hessian (we refer the reader to [23, Section 3] to see how this argument works in
the Riemannian case).

4 Examples 

The aim of the present section is to provide a list of examples where some of our
theorems apply. For each kind of sub-Riemannian manifold that we present, we provide
a regularity result for the associated squared sub-Riemannian distance function. We
leave it to the reader to check in each case which of our theorems holds under that
regularity property. Before giving examples, we recall that, if $\Delta$ is a smooth distribution
on $M$, a section of $\Delta$ is any smooth vector field $X$ satisfying $X(x) \in \Delta(x)$ for any
$x \in M$. For any smooth vector field $Z$ on $M$ and every $x \in M$, we shall denote by
$[Z, \Delta](x)$, $[\Delta, \Delta](x)$, and $[Z, [\Delta, \Delta]](x)$ the subspaces of $T_xM$ given by

\[ [Z, \Delta](x) := \{ [Z,X](x) \mid X \text{ section of } \Delta \}, \]

\[ [\Delta, \Delta](x) := \mathrm{Span}\{ [X,Y](x) \mid X, Y \text{ sections of } \Delta \}, \]
\[ [Z, [\Delta, \Delta]](x) := \mathrm{Span}\{ [Z,[X,Y]](x) \mid X, Y \text{ sections of } \Delta \}. \]
4.1 Fat distributions

The distribution $\Delta$ is called fat if, for every $x \in M$ and every vector field $X$ on $M$ such
that $X(x) \in \Delta(x) \setminus \{0\}$, there holds

$$T_xM = \Delta(x) + [X, \Delta](x).$$

The above condition being very restrictive, there are very few fat distributions (see
[30]). Fat distributions on three-dimensional manifolds are the rank-two distributions
$\Delta$ satisfying

$$T_xM = \mathrm{Span}\{ f_1(x), f_2(x), [f_1, f_2](x) \} \qquad \forall x \in M,$$

where $(f_1, f_2)$ is a 2-tuple of vector fields representing locally the distribution $\Delta$. A
classical example of fat distribution in $\mathbb{R}^3$ is given by the distribution spanned by the
vector fields

$$X_1 = \frac{\partial}{\partial x_1}, \qquad X_2 = \frac{\partial}{\partial x_2} + x_1 \frac{\partial}{\partial x_3}.$$

This is the distribution appearing in the Heisenberg group (see [8, 9, 24]). It can be
shown that, if $\Delta$ is a fat distribution, then any nontrivial (i.e. not constant) horizontal
path with respect to $\Delta$ is nonsingular (see [TU, 30, 33]). As a consequence, Theorems
5.9 and 5.11 yield the following result:



Proposition 4.1. If $\Delta$ is fat on $M$, then the squared sub-Riemannian distance function
is locally Lipschitz on $M \times M$ and locally semiconcave on $M \times M \setminus D$.
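
As a quick check of the definition, the bracket computation behind the Heisenberg example can be written out explicitly. The following is only a sketch for the standard Heisenberg fields $X_1 = \partial_{x_1}$, $X_2 = \partial_{x_2} + x_1 \partial_{x_3}$; the coefficient functions $a, b$ are notation introduced here for illustration.

```latex
% Bracket of the two generating fields: only the coefficient x_1 of X_2 is differentiated.
\[
[X_1, X_2]
= \Big[ \frac{\partial}{\partial x_1},\; \frac{\partial}{\partial x_2} + x_1 \frac{\partial}{\partial x_3} \Big]
= \frac{\partial x_1}{\partial x_1}\, \frac{\partial}{\partial x_3}
= \frac{\partial}{\partial x_3}.
\]
% Fatness: take a section X = a X_1 + b X_2 with (a(x), b(x)) \neq (0, 0).
\[
[X, X_1](x) \equiv -\,b(x)\, \frac{\partial}{\partial x_3}, \qquad
[X, X_2](x) \equiv a(x)\, \frac{\partial}{\partial x_3} \pmod{\Delta(x)},
\]
% Since \partial/\partial x_3 is transverse to \Delta(x), one bracket always leaves \Delta(x).
\[
\Delta(x) + [X, \Delta](x) = \mathbb{R}^3 .
\]
```

In particular the three vectors $X_1(x)$, $X_2(x)$, $[X_1, X_2](x)$ span $\mathbb{R}^3$ at every point, which is the three-dimensional criterion for fatness recalled in this subsection.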



4.2 Two-generating distributions 

A distribution $\Delta$ is called two-generating if

$$T_xM = \Delta(x) + [\Delta, \Delta](x) \qquad \forall x \in M.$$

Any fat distribution is two-generating. Moreover, if the ambient manifold $M$ has dimension
three, then any two-generating distribution is fat. The distribution $\Delta$ in $\mathbb{R}^4$
which is spanned by the vector fields

$$X_1 = \frac{\partial}{\partial x_1}, \qquad X_2 = \frac{\partial}{\partial x_2}, \qquad X_3 = \frac{\partial}{\partial x_3} + x_1 \frac{\partial}{\partial x_4},$$

provides an example of a distribution which is two-generating but not fat. It is easy to see
that, if the distribution is two-generating, then there are no Goh paths (see Paragraph
5.9 for the definition of Goh path). As a consequence, by Theorem 5.11 we have:

Proposition 4.2. If $\Delta$ is two-generating on $M$, then the squared sub-Riemannian
distance function is locally Lipschitz on $M \times M$.

The above result and its consequences in optimal transport are due to Agrachev
and Lee (see [3]).
4.3 Generic sub-Riemannian structures

Let $(M, g)$ be a complete Riemannian manifold of dimension $\geq 4$, and $m \geq 3$ be a
positive integer. Denote by $\mathcal{D}_m$ the space of rank $m$ distributions on $M$ endowed
with the Whitney $C^\infty$ topology. Chitour, Jean and Trélat proved that there exists an
open dense subset $\mathcal{O}_m$ of $\mathcal{D}_m$ such that every element of $\mathcal{O}_m$ does not admit nontrivial
minimizing singular paths (see [15, 16]). As a consequence, we have:

Proposition 4.3. Let $(M, g)$ be a complete Riemannian manifold of dimension $\geq 4$.
Then, for any generic distribution of rank $\geq 3$, the squared sub-Riemannian distance
function is locally semiconcave on $M \times M \setminus D$.

This result implies in particular that, for generic sub-Riemannian manifolds, we
have existence and uniqueness of optimal transport maps, and absolute continuity of
Wasserstein geodesics.



4.4 Nonholonomic distributions on three-dimensional manifolds 

Assume that $M$ has dimension 3 and that $\Delta$ is a nonholonomic rank-two distribution
on $M$, and define

$$\Sigma_\Delta := \{ x \in M \mid \Delta(x) + [\Delta, \Delta](x) \neq \mathbb{R}^3 \}.$$

The set $\Sigma_\Delta$ is called the singular set or the Martinet set of $\Delta$. As an example, take
the nonholonomic distribution $\Delta$ in $\mathbb{R}^3$ which is spanned by the vector fields

$$f_1 = \frac{\partial}{\partial x_1}, \qquad f_2 = \frac{\partial}{\partial x_2} + x_1^2 \frac{\partial}{\partial x_3}.$$

It is easy to show that the singular set of $\Delta$ is the plane $\{x_1 = 0\}$. This distribution
is often called the Martinet distribution, and $\Sigma_\Delta$ the Martinet surface. The singular
horizontal paths of $\Delta$ correspond to the horizontal paths which are included in $\Sigma_\Delta$.
This means that necessarily any singular horizontal path is, up to reparameterization,
a restriction of an arc of the form $t \mapsto (0, t, \bar{x}_3) \in \mathbb{R}^3$ with $\bar{x}_3 \in \mathbb{R}$. This kind of
result holds for any rank-two distribution in dimension three (we postpone its proof to
Appendix B):

Proposition 4.4. Let $\Delta$ be a nonholonomic distribution on a three-dimensional manifold.
Then $\Sigma_\Delta$ is a closed subset of $M$ which is countably 2-rectifiable. Moreover a
nontrivial horizontal path $\gamma : [0, 1] \to M$ is singular if and only if it is included in $\Sigma_\Delta$.

Proposition 4.4 implies that for any pair $(x, y) \in M \times M$ (with $x \neq y$) such that $x$
or $y$ does not belong to $\Sigma_\Delta$, any sub-Riemannian minimizing geodesic between $x$ and
$y$ is nonsingular. As a consequence, thanks to Theorems 5.9 and 5.11, the following
result holds:

Proposition 4.5. Let $\Delta$ be a nonholonomic distribution on a three-dimensional manifold.
The squared sub-Riemannian distance function is locally Lipschitz on
$M \times M \setminus (\Sigma_\Delta \times \Sigma_\Delta)$ and locally semiconcave on $M \times M \setminus (D \cup \Sigma_\Delta \times \Sigma_\Delta)$.

We observe that, since $\Sigma_\Delta$ is countably 2-rectifiable, for any pair of measures $\mu, \nu \in P_c(M)$
such that $\mu$ gives no measure to countably 2-rectifiable sets, the conclusions of
Theorem 3.3 hold.
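
The claims about the Martinet distribution can be verified by a direct computation. The sketch below uses the fields $f_1 = \partial_{x_1}$, $f_2 = \partial_{x_2} + x_1^2 \partial_{x_3}$; the controls $u_1, u_2$ are notation introduced here only to describe horizontal paths.

```latex
% First and second order brackets of the Martinet fields.
\[
[f_1, f_2] = 2 x_1 \frac{\partial}{\partial x_3}, \qquad
[f_1, [f_1, f_2]] = 2 \frac{\partial}{\partial x_3}.
\]
% \Delta(x) + [\Delta,\Delta](x) = R^3 exactly when x_1 \neq 0, while the second
% bracket never vanishes, so \Delta is nonholonomic and
\[
\Sigma_\Delta = \{ x \in \mathbb{R}^3 \mid x_1 = 0 \}.
\]
% A horizontal path \dot\gamma = u_1 f_1(\gamma) + u_2 f_2(\gamma) staying in \{x_1 = 0\} satisfies
\[
\dot{x}_1 = u_1 = 0, \qquad \dot{x}_2 = u_2, \qquad \dot{x}_3 = x_1^2 u_2 = 0,
\]
% hence x_3 is constant along the path and only x_2 moves.
```

This recovers, up to reparameterization, the arcs $t \mapsto (0, t, \bar{x}_3)$ described in the example.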
4.5 Medium-fat distributions

The distribution $\Delta$ is called medium-fat if, for every $x \in M$ and every vector field $X$
on $M$ such that $X(x) \in \Delta(x) \setminus \{0\}$, there holds

$$T_xM = \Delta(x) + [\Delta, \Delta](x) + [X, [\Delta, \Delta]](x).$$

Any two-generating distribution is medium-fat. An example of a medium-fat distribution
which is not two-generating is given by the rank-three distribution in $\mathbb{R}^4$ which is
spanned by the vector fields

$$f_1 = \frac{\partial}{\partial x_1}, \qquad f_2 = \frac{\partial}{\partial x_2}, \qquad f_3 = \frac{\partial}{\partial x_3} + (x_1 + x_2 + x_3)^2 \frac{\partial}{\partial x_4}.$$

Medium-fat distributions were introduced by Agrachev and Sarychev in [5] (we refer the
interested reader to that paper for a detailed study of this kind of distribution). It
can easily be shown that medium-fat distributions do not admit nontrivial Goh paths.
As a consequence, Theorem 5.11 yields:

Proposition 4.6. Assume that $\Delta$ is medium-fat. Then the squared sub-Riemannian
distance function is locally Lipschitz on $M \times M \setminus D$.

Let us moreover observe that, given a medium-fat distribution, it can be shown
that for a generic smooth complete Riemannian metric on $M$ the distribution does not
admit nontrivial singular sub-Riemannian minimizing geodesics (see [15, 16]). As a
consequence, we have:

Proposition 4.7. Let $\Delta$ be a medium-fat distribution on $M$. Then, for "generic" Riemannian
metrics, the squared sub-Riemannian distance function is locally semiconcave
on $M \times M \setminus D$.

Notice that, since two-generating distributions are medium-fat, the latter result
holds for two-generating distributions.



4.6 Codimension-one nonholonomic distributions 

Let $M$ have dimension $n$, and $\Delta$ be a nonholonomic distribution of rank $n - 1$. As in
the case of nonholonomic distributions on three-dimensional manifolds, we can define
the singular set associated to the distribution as

$$\Sigma_\Delta := \{ x \in M \mid \Delta(x) + [\Delta, \Delta](x) \neq T_xM \}.$$

The following result holds (we postpone its proof to Appendix B):

Proposition 4.8. If $\Delta$ is a nonholonomic distribution of rank $n - 1$, then the set $\Sigma_\Delta$
is a closed subset of $M$ which is countably $(n - 1)$-rectifiable. Moreover any Goh path
is contained in $\Sigma_\Delta$.

From Theorem 5.11 we have:

Proposition 4.9. The squared sub-Riemannian distance function is locally Lipschitz
on $M \times M \setminus (\Sigma_\Delta \times \Sigma_\Delta)$.

Note that, as for medium-fat distributions, for generic metrics the function $d_{SR}^2$ is
locally semiconcave on $M \times M \setminus (D \cup \Sigma_\Delta \times \Sigma_\Delta)$.
4.7 Rank-two distributions in dimension four

Let $(M, \Delta, g)$ be a complete sub-Riemannian manifold of dimension four, and let $\Delta$ be
a regular rank-two distribution, that is

$$T_xM = \mathrm{Span}\{ f_1(x), f_2(x), [f_1, f_2](x), [f_1, [f_1, f_2]](x), [f_2, [f_1, f_2]](x) \}$$

for any local parametrization of the distribution. In [36] Sussmann shows that there
is a smooth horizontal vector field $X$ on $M$ such that the singular horizontal curves
$\gamma$ parametrized by arc-length are exactly the integral curves of $X$, i.e. the curves
satisfying

$$\dot\gamma(t) = X(\gamma(t)).$$

It can also be shown that those curves are locally minimizing between their
end-points (see [27, 36]). For every $x \in M$, denote by $\mathcal{O}(x)$ the orbit of $x$ by the flow
of $X$, and set

$$\Omega := \{ (x, y) \in M \times M \mid y \notin \mathcal{O}(x) \}.$$

Sussmann's Theorem, together with Theorem 5.9, yields the following result:

Proposition 4.10. Under the above assumption, the function $d_{SR}^2$ is locally semiconcave
in the interior of $\Omega$.

As an example, consider the distribution $\Delta$ in $\mathbb{R}^4$ spanned by the two vector fields

$$f_1 = \frac{\partial}{\partial x_1}, \qquad f_2 = \frac{\partial}{\partial x_2} + x_1 \frac{\partial}{\partial x_3} + x_3 \frac{\partial}{\partial x_4}.$$

It is easy to show that a horizontal path $\gamma : [0, 1] \to \mathbb{R}^4$ is singular if and only if it
satisfies, up to reparameterization by arc-length,

$$\dot\gamma(t) = f_1(\gamma(t)) \qquad \forall t \in [0, 1].$$

By Proposition 4.10 we deduce that, for any complete metric $g$ on $\mathbb{R}^4$, the function
$d_{SR}^2$ is locally semiconcave on the set

$$\Omega = \{ (x, y) \in \mathbb{R}^4 \times \mathbb{R}^4 \mid (y - x) \notin \mathrm{Span}\{e_1\} \},$$

where $e_1$ denotes the first vector in the canonical basis of $\mathbb{R}^4$. Consequently, for any
pair of measures $\mu \in P_c^{ac}(M)$, $\nu \in P_c(M)$ satisfying $\mathrm{supp}(\mu \times \nu) \subset \Omega$, Theorem 3.2
applies (or more generally, if $\mu \times \nu(\Omega) = 1$, we can apply the argument in Paragraph

K 



5 Facts in sub-Riemannian geometry 

Throughout this section $(M, \Delta, g)$ denotes a sub-Riemannian manifold of rank $m < n$,
which is assumed to be complete with respect to the sub-Riemannian distance. As in
the Riemannian case, the Hopf-Rinow Theorem holds. In particular any two points
in $M$ can be joined by a minimizing geodesic, and any sub-Riemannian ball of finite
radius is a compact subset of $M$. We refer the reader to [30, Appendix D] for the
proofs of those results. We present in the following subsections a list of basic facts in
sub-Riemannian geometry, whose proofs may be found in [30] and [33].
5.1 Nonholonomic distributions vs. nonholonomic control systems

Any nonholonomic distribution can be locally parameterized by a nonholonomic control
system, that is by a smooth dynamical system with parameters called controls. Indeed,
assume that $V$ is an open subset of $M$ such that there are $m$ smooth vector fields
$f_1, \ldots, f_m$ on $V$ which parametrize the nonholonomic distribution $\Delta$ on $V$, that is
which satisfy

$$\Delta(x) = \mathrm{Span}\{ f_1(x), \ldots, f_m(x) \} \qquad \forall x \in V,$$

and

$$\mathrm{Lie}\{ f_1, \ldots, f_m \}(x) = T_xM \qquad \forall x \in V.$$

Given $x \in V$, there is a correspondence between the set of horizontal paths in $\Omega_\Delta(x)$
which remain in $V$ and the set of admissible controls of the control system

$$\dot{x} = \sum_{i=1}^m u_i f_i(x).$$

A control $u \in L^2([0, 1], \mathbb{R}^m)$ is called admissible with respect to $x$ and $V$ if the solution
$\gamma_{x,u}$ to the Cauchy problem

$$\dot{x}(t) = \sum_{i=1}^m u_i(t) f_i(x(t)) \quad \text{for a.e. } t \in [0, 1], \qquad x(0) = x,$$

is well-defined on $[0, 1]$ and remains in $V$. The set $\mathcal{U}_x$ of admissible controls is an open
subset of $L^2([0, 1], \mathbb{R}^m)$.

Proposition 5.1. Given $x \in M$, the mapping

$$\mathcal{U}_x \to \Omega_\Delta(x), \qquad u \mapsto \gamma_{x,u}$$

is one-to-one.

Given $x \in M$, the end-point mapping from $x$, from the control viewpoint, takes the
following form

$$E_x : \mathcal{U}_x \to M, \qquad u \mapsto \gamma_{x,u}(1).$$

This mapping is smooth. The derivative of the end-point mapping from $x$ at $u \in \mathcal{U}_x$,
that we shall denote by $dE_x(u)$, is given by

$$dE_x(u)(v) = d\Phi^u(1, x) \int_0^1 \big( d\Phi^u(t, x) \big)^{-1} \Big( \sum_{i=1}^m v_i(t) f_i(\gamma_{x,u}(t)) \Big) \, dt \qquad \forall v \in L^2([0, 1], \mathbb{R}^m),$$

where $\Phi^u(t, x)$ denotes the flow of the time-dependent vector field $X^u$ defined by

$$X^u(t, x) := \sum_{i=1}^m u_i(t) f_i(x) \quad \text{for a.e. } t \in [0, 1], \ \forall x \in V,$$

(note that the flow is well-defined in a neighborhood of $x$). We say that an admissible
control $u$ is singular with respect to $x$ if $dE_x$ is singular at $u$. Observe that this is
equivalent to saying that its associated horizontal path is singular (see the definition of
singular path given in Section 2). It is important to notice that the singularity of a
given horizontal path does not depend on the metric but only on the distribution.
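
To make the control viewpoint concrete, here is a sketch of the end-point mapping for the Heisenberg fields $f_1 = \partial_{x_1}$, $f_2 = \partial_{x_2} + x_1 \partial_{x_3}$ in $\mathbb{R}^3$, starting from the origin. The choice of base point and the explicit integrals are illustrative assumptions and are not taken from the text.

```latex
% Control system \dot{x} = u_1 f_1(x) + u_2 f_2(x), with x(0) = 0:
\[
\dot{x}_1 = u_1, \qquad \dot{x}_2 = u_2, \qquad \dot{x}_3 = x_1 u_2 .
\]
% End-point mapping from the origin:
\[
E_0(u) = \Big( \int_0^1 u_1(t)\,dt,\ \int_0^1 u_2(t)\,dt,\
\int_0^1 \Big( \int_0^t u_1(s)\,ds \Big) u_2(t)\,dt \Big).
\]
% Its derivative at u in the direction v:
\[
dE_0(u)(v) = \Big( \int_0^1 v_1\,dt,\ \int_0^1 v_2\,dt,\
\int_0^1 \Big[ \Big( \int_0^t v_1 \Big) u_2(t) + \Big( \int_0^t u_1 \Big) v_2(t) \Big] dt \Big).
\]
```

For $u \equiv 0$ the third component of $dE_0(0)(v)$ vanishes for every $v$, so the image is $\mathbb{R}^2 \times \{0\}$ and the constant path is singular. This is consistent with the fact that, for fat distributions, only nontrivial horizontal paths are guaranteed to be nonsingular.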
5.2 Characterization of singular horizontal paths

Denote by $\omega$ the canonical symplectic form on $T^*M$ and by $\Delta^\perp$ the annihilator of $\Delta$
in $T^*M$ minus its zero section. Define $\overline{\omega}$ as the restriction of $\omega$ to $\Delta^\perp$. An absolutely
continuous curve $\psi : [0, 1] \to \Delta^\perp$ such that

$$\dot\psi(t) \in \ker \overline{\omega}(\psi(t)) \quad \text{for a.e. } t \in [0, 1]$$

is called an abnormal extremal of $\Delta$.

Proposition 5.2. A horizontal path $\gamma : [0, 1] \to M$ is singular if and only if it is the
projection of an abnormal extremal $\psi$ of $\Delta$. The curve $\psi$ is said to be an abnormal
extremal lift of $\gamma$.

If the distribution is parametrized by a family of $m$ smooth vector fields $f_1, \ldots, f_m$
on some open set $V \subset M$, and if in addition the cotangent bundle $T^*M$ is trivializable
over $V$, then the singular controls, or equivalently the singular horizontal paths which
are contained in $V$, can be characterized as follows. Define the pseudo-Hamiltonian
$H_0 : V \times (\mathbb{R}^n)^* \times \mathbb{R}^m \to \mathbb{R}$ by

$$H_0(x, p, u) = \sum_{i=1}^m u_i \, p(f_i(x)).$$

Proposition 5.3. Let $x \in V$ and $u$ be an admissible control with respect to $x$ and
$V$. Then, the control $u$ is singular (with respect to $x$) if and only if there is an arc
$p : [0, 1] \to (\mathbb{R}^n)^* \setminus \{0\}$ in $W^{1,2}$ such that the pair $(x = \gamma_{x,u}, p)$ satisfies

$$\begin{cases} \dot{x}(t) = \dfrac{\partial H_0}{\partial p}(x(t), p(t), u(t)) = \sum_{i=1}^m u_i(t) f_i(x(t)) \\[2mm] \dot{p}(t) = -\dfrac{\partial H_0}{\partial x}(x(t), p(t), u(t)) = -\sum_{i=1}^m u_i(t) \, p(t) \cdot df_i(x(t)) \end{cases} \qquad (5.1)$$

for a.e. $t \in [0, 1]$ and

$$p(t) \cdot f_i(x(t)) = 0 \qquad \forall t \in [0, 1], \ \forall i = 1, \ldots, m. \qquad (5.2)$$

A control or a horizontal path which is singular is sometimes called abnormal. If it
is not singular, we call it nonsingular or regular.

5.3 Sub-Riemannian minimizing geodesics

As we said in Section 2, since the metric space $(M, d_{SR})$ is assumed to be complete, for
every pair $x, y \in M$ there is a horizontal path $\gamma$ joining $x$ to $y$ such that

$$d_{SR}(x, y) = \mathrm{length}_g(\gamma).$$

If $\gamma$ is parametrized by arc-length, then using the Cauchy-Schwarz inequality it is easy to
show that $\gamma$ minimizes the quantity

$$\int_0^1 g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t)) \, dt =: \mathrm{energy}_g(\gamma),$$

over all horizontal paths joining $x$ to $y$. This infimum, denoted by $e_{SR}(x, y)$, is called
the sub-Riemannian energy between $x$ and $y$. Since $M$ is assumed to be complete,
the infimum is always attained, and the horizontal paths which minimize the sub-Riemannian
energy are those which minimize the sub-Riemannian distance and which
are parametrized by arc-length. In particular, one has

$$e_{SR}(x, y) = d_{SR}^2(x, y) \qquad \forall x, y \in M.$$
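
The Cauchy-Schwarz step behind the identity $e_{SR} = d_{SR}^2$ can be spelled out as follows. This is a short sketch; $|\dot\gamma(t)|$ abbreviates $g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))^{1/2}$, a shorthand introduced here.

```latex
% Cauchy-Schwarz on [0,1], applied to |\dot\gamma| and the constant function 1.
\[
\mathrm{length}_g(\gamma) = \int_0^1 |\dot\gamma(t)| \, dt
\le \Big( \int_0^1 |\dot\gamma(t)|^2 \, dt \Big)^{1/2}
= \mathrm{energy}_g(\gamma)^{1/2},
\]
% with equality if and only if |\dot\gamma(t)| is constant for a.e. t.
\[
d_{SR}^2(x, y) \le \mathrm{length}_g(\gamma)^2 \le \mathrm{energy}_g(\gamma)
\qquad \text{for every horizontal path } \gamma \text{ joining } x \text{ to } y .
\]
```

Both inequalities become equalities for a length-minimizing path with constant speed, which gives $e_{SR}(x, y) = d_{SR}^2(x, y)$.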

Assume from now on that $\gamma$ is a given horizontal path minimizing the energy between $x$
and $y$. Such a path is called a sub-Riemannian minimizing geodesic. Since $\gamma$ also minimizes
the distance, it has no self-intersection. Hence we can parametrize the distribution
along $\gamma$: there is an open neighborhood $V$ of $\gamma([0, 1])$ in $M$ and an orthonormal family
(with respect to the metric $g$) of $m$ smooth vector fields $f_1, \ldots, f_m$ such that

$$\Delta(z) = \mathrm{Span}\{ f_1(z), \ldots, f_m(z) \} \qquad \forall z \in V.$$

Moreover, since $\gamma$ belongs to $W^{1,2}([0, 1], M)$, there exists a control $u^\gamma \in L^2([0, 1], \mathbb{R}^m)$
(in fact, $|u^\gamma(t)|^2$ is constant), which is admissible with respect to $x$ and $V$, such that

$$\dot\gamma(t) = \sum_{i=1}^m u_i^\gamma(t) f_i(\gamma(t)) \quad \text{for a.e. } t \in [0, 1].$$

By the discussion above, we know that $u^\gamma$ minimizes the quantity

$$\int_0^1 g_{\gamma_{x,u}(t)} \Big( \sum_{i=1}^m u_i(t) f_i(\gamma_{x,u}(t)), \sum_{i=1}^m u_i(t) f_i(\gamma_{x,u}(t)) \Big) \, dt = \int_0^1 \sum_{i=1}^m u_i(t)^2 \, dt =: C(u),$$

among all controls $u \in L^2([0, 1], \mathbb{R}^m)$ which are admissible with respect to $x$ and $V$,
and which satisfy the constraint

$$E_x(u) = y.$$

By the Lagrange Multiplier Theorem, there is $\lambda \in (\mathbb{R}^n)^*$ and $\lambda_0 \in \{0, 1\}$ such that

$$\lambda \cdot dE_x(u^\gamma) - \lambda_0 \, dC(u^\gamma) = 0. \qquad (5.3)$$

Two cases may appear, either $\lambda_0 = 0$ or $\lambda_0 = 1$. By restricting $V$ if necessary, we
can assume that the cotangent bundle $T^*M$ is trivializable with coordinates $(x, p) \in \mathbb{R}^n \times (\mathbb{R}^n)^*$
over $V$.

First case: $\lambda_0 = 0$. The linear operator $dE_x(u^\gamma) : L^2([0, 1], \mathbb{R}^m) \to T_yM$ cannot be
onto, which means that the control $u^\gamma$ is necessarily singular. Hence there is an arc
$p : [0, 1] \to (\mathbb{R}^n)^* \setminus \{0\}$ in $W^{1,2}$ satisfying (5.1) and (5.2). In other terms, $\gamma = \gamma_{x,u^\gamma}$
admits an abnormal extremal lift in $T^*M$. We also say that $\gamma$ is an abnormal minimizing
geodesic.

Second case: $\lambda_0 = 1$. In local coordinates, the Hamiltonian $H$ (defined in (2.2)) takes
the following form:

$$H(x, p) = \frac{1}{2} \sum_{i=1}^m (p \cdot f_i(x))^2 = \max_{u \in \mathbb{R}^m} \Big\{ \sum_{i=1}^m u_i \, p \cdot f_i(x) - \frac{1}{2} \sum_{i=1}^m u_i^2 \Big\} \qquad (5.4)$$

for all $(x, p) \in V \times (\mathbb{R}^n)^*$. Then the following result holds:






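Proposition 5.4. Equality (5.3) with $\lambda_0 = 1$ yields the existence of an arc $p : [0, 1] \to (\mathbb{R}^n)^*$
in $W^{1,2}$, with $p(1) = \frac{\lambda}{2}$, such that the pair $(\gamma = \gamma_{x,u^\gamma}, p)$ satisfies

$$\begin{cases} \dot\gamma(t) = \dfrac{\partial H}{\partial p}(\gamma(t), p(t)) = \sum_{i=1}^m [p(t) \cdot f_i(\gamma(t))] \, f_i(\gamma(t)) \\[2mm] \dot{p}(t) = -\dfrac{\partial H}{\partial x}(\gamma(t), p(t)) = -\sum_{i=1}^m [p(t) \cdot f_i(\gamma(t))] \, p(t) \cdot df_i(\gamma(t)) \end{cases} \qquad (5.5)$$

for a.e. $t \in [0, 1]$ and

$$u_i^\gamma(t) = p(t) \cdot f_i(\gamma(t)) \quad \text{for a.e. } t \in [0, 1], \ \forall i = 1, \ldots, m. \qquad (5.6)$$

In particular, the path $\gamma$ is smooth on $[0, 1]$. The curve $\gamma$ and the control $u^\gamma$ are called
normal.

The curve $\psi : [0, 1] \to T^*M$ given by $\psi(t) = (\gamma(t), p(t))$ for every $t \in [0, 1]$ is a
normal extremal whose projection is $\gamma$ and which satisfies $\psi(1) = (y, \frac{\lambda}{2})$. We say
that $\psi$ is a normal extremal lift of $\gamma$. We also say that $\gamma$ is a normal minimizing geodesic.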
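
For the reader's convenience, here is a sketch of why the maximum in (5.4) has the stated value and how it matches (5.6). The shorthand $a_i := p \cdot f_i(x)$ and the function $\Lambda$ are introduced here only for this computation.

```latex
% For fixed (x, p), maximize a strictly concave quadratic function of u.
\[
\Lambda(u) := \sum_{i=1}^m u_i a_i - \frac12 \sum_{i=1}^m u_i^2, \qquad
\frac{\partial \Lambda}{\partial u_i}(u) = a_i - u_i .
\]
% The unique critical point u = a is the global maximizer:
\[
\max_{u \in \mathbb{R}^m} \Lambda(u) = \Lambda(a)
= \sum_{i=1}^m a_i^2 - \frac12 \sum_{i=1}^m a_i^2
= \frac12 \sum_{i=1}^m \big( p \cdot f_i(x) \big)^2 = H(x, p).
\]
```

The maximizer $u_i = p \cdot f_i(x)$ is exactly the relation (5.6) between a normal control and its covector, and along a normal extremal one gets $\sum_{i=1}^m u_i^\gamma(t)^2 = 2 H(\gamma(t), p(t))$.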

To summarize, the minimizing geodesic (or equivalently the minimizing control $u^\gamma$)
is either abnormal or normal. Note that it could be both normal and abnormal. For
decades the prevailing wisdom was that every sub-Riemannian minimizing geodesic is
normal, meaning that it admits a normal extremal lift. In 1991, Montgomery found
the first counterexample to this assertion (see [29, 30]).

5.4 The sub-Riemannian exponential mapping 

Let $x \in M$ be fixed. The sub-Riemannian exponential mapping from $x$ is defined by

$$\exp_x : T_x^*M \to M, \qquad p \mapsto \pi(\psi(1)),$$

where $\psi$ is the normal extremal so that $\psi(0) = (x, p)$ in local coordinates. Note that
$H(\psi(t))$ is constant along a normal extremal $\psi$, hence we have

$$\mathrm{energy}_g(\pi(\psi)) = (\mathrm{length}_g(\pi(\psi)))^2 = 2H(\psi(0)).$$

The exponential mapping is not necessarily onto. However, since $(M, d_{SR})$ is complete,
the image of the exponential mapping, $\exp_x(T_x^*M)$, can be shown to contain an open
dense subset of $M$. This result, which was obtained recently by Agrachev (see [2]), is
a consequence of the following fact (which appeared in [34], see also [3]), which is also
crucial in the proofs of Theorems 3.2 and 3.3.

Proposition 5.5. Let $y \in M$, and assume that there is a function $\phi : M \to \mathbb{R}$
differentiable at $y$ such that

$$\phi(y) = d_{SR}^2(x, y) \quad \text{and} \quad d_{SR}^2(x, z) \geq \phi(z) \qquad \forall z \in M.$$

Then there exists a unique minimizing geodesic between $x$ and $y$, which is the projection
of the normal extremal $\psi : [0, 1] \to T^*M$ satisfying $\psi(1) = (y, \frac{1}{2} d\phi(y))$. In particular
$x = \exp_y(-\frac{1}{2} d\phi(y))$.
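
As a sanity check of the normalization in Proposition 5.5, one can look at the Euclidean model. The sketch below assumes $M = \mathbb{R}^n$ with $f_i = \partial_{x_i}$ and $m = n$, which is a Riemannian rather than a genuinely sub-Riemannian situation, and is only meant to illustrate the factor $\frac{1}{2}$.

```latex
% Euclidean model: H(x, p) = |p|^2 / 2, normal extremals are straight lines.
\[
\dot\gamma = p, \quad \dot{p} = 0
\;\Longrightarrow\; \exp_y(p) = y + p .
\]
% Take \phi(z) := |z - x|^2 = d^2(x, z), which is smooth and touches d^2(x, .) everywhere.
\[
d\phi(y) = 2 (y - x), \qquad
\exp_y\big( -\tfrac12 d\phi(y) \big) = y - (y - x) = x .
\]
% The extremal \psi(t) = (x + t (y - x),\, y - x) indeed satisfies
\[
\psi(1) = \big( y,\ \tfrac12 d\phi(y) \big).
\]
```

In this model the unique minimizing geodesic is the segment from $x$ to $y$, in agreement with the statement.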
5.5 The horizontal eikonal equation

As in the Riemannian case, the sub-Riemannian distance function from a given point
satisfies a Hamilton-Jacobi equation. This fact is important for the proof of Theorem
3.3. Let us first recall the definition of viscosity solution:

Definition 5.6. Let $F : T^*M \times \mathbb{R} \to \mathbb{R}$ be a given continuous function, and let $U$ be an
open subset of $M$. A continuous function $u : U \to \mathbb{R}$ is said to be a viscosity subsolution
on $U$ of the Hamilton-Jacobi equation

$$F(x, du(x), u(x)) = 0 \qquad (5.7)$$

if and only if, for every $C^1$ function $\phi : U \to \mathbb{R}$ satisfying $\phi \geq u$ we have

$$\forall x \in U, \quad \phi(x) = u(x) \implies F(x, d\phi(x), u(x)) \leq 0.$$

Similarly, a continuous function $u : U \to \mathbb{R}$ is said to be a viscosity supersolution of
(5.7) on $U$ if and only if, for every $C^1$ function $\psi : U \to \mathbb{R}$ satisfying $\psi \leq u$ we have

$$\forall x \in U, \quad \psi(x) = u(x) \implies F(x, d\psi(x), u(x)) \geq 0.$$

A continuous function $u : U \to \mathbb{R}$ is called a viscosity solution of (5.7) on $U$ if it is
both a viscosity subsolution and a viscosity supersolution of (5.7) on $U$.

Proposition 5.7. For every $x \in M$ the function $f(\cdot) = d_{SR}(x, \cdot)$ is a viscosity solution
of the Hamilton-Jacobi equation

$$H(y, df(y)) = \frac{1}{2} \qquad \forall y \in M \setminus \{x\}. \qquad (5.8)$$
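
The constant $\frac{1}{2}$ in (5.8) can be checked in the Euclidean model. As before, this sketch assumes $M = \mathbb{R}^n$, $f_i = \partial_{x_i}$ and $m = n$, so that $H(y, p) = \frac{1}{2} |p|^2$; it is an illustration, not part of the proof.

```latex
% Euclidean distance from x, differentiated away from x.
\[
f(y) = |y - x|, \qquad df(y) = \frac{y - x}{|y - x|} \quad (y \neq x),
\]
\[
H(y, df(y)) = \frac12 \, |df(y)|^2 = \frac12 .
\]
```

In the general case the same identity says that the horizontal gradient of $d_{SR}(x, \cdot)$ has unit norm, in the viscosity sense of Definition 5.6.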

5.6 Compactness of minimizing geodesics

The compactness of minimizing curves is crucial to prove regularity properties of the
sub-Riemannian distance. Let us denote by $W_\Delta^{1,2}([0, 1], M)$ the set of horizontal paths
$\gamma : [0, 1] \to M$ endowed with the $W^{1,2}$-topology. For every $\gamma \in W_\Delta^{1,2}([0, 1], M)$, the
energy of $\gamma$ with respect to $g$ is well-defined. The classical compactness result taken
from Agrachev pQ reads as follows: 

Proposition 5.8. For every compact $K \subset M$, the set

$$\mathcal{K} := \{ \gamma \in W_\Delta^{1,2}([0, 1], M) \mid \exists x, y \in K \text{ with } e_{SR}(x, y) = \mathrm{energy}_g(\gamma) \}$$

is a compact subset of $W^{1,2}([0, 1], M)$.

5.7 Local semiconcavity of the sub-Riemannian distance 

As we said in Section 2, the sub-Riemannian distance can be shown to be locally
Hölder continuous on $M \times M$, but in general it has no reason to be more regular.
In the next sections, we are going to show that, under appropriate assumptions
on the sub-Riemannian structure, $d_{SR}$ enjoys more regularity properties, such as local
semiconcavity or local Lipschitz regularity.

Recall that $D$ denotes the diagonal of $M \times M$, that is the set of all pairs of the form
$(x, x)$ with $x \in M$. Thanks to Proposition 5.8, the following result holds:

Theorem 5.9. Let $\Omega$ be an open subset of $M \times M$ such that, for every pair $(x, y) \in \Omega$
with $x \neq y$, any minimizing geodesic between $x$ and $y$ is nonsingular. Then the distance
function $d_{SR}$ (or equivalently $d_{SR}^2$) is locally semiconcave on $\Omega \setminus D$.

Since Theorem 5.9 plays a crucial role in the present paper and does not appear in
this general form in [13], we prefer to give a sketch of its proof. We refer the reader to
[13, 33] for more details.



Proof. Let us fix $(x, y) \in \Omega \setminus D$ and show that $d_{SR}$ is semiconcave in a neighborhood
of $(x, y)$ in $M \times M \setminus D$. Let $U_x$ and $U_y$ be two compact neighborhoods of $x$ and $y$
such that $U_x \times U_y \subset \Omega \setminus D$. Denote by $\mathcal{K}$ the set of minimizing horizontal paths
$\gamma$ in $W_\Delta^{1,2}([0, 1], M)$ such that $\gamma(0) \in U_x$ and $\gamma(1) \in U_y$. Thanks to Proposition
5.8, $\mathcal{K}$ is a compact subset of $W^{1,2}([0, 1], M)$. Let $(x', y') \in U_x \times U_y$ be fixed. Since
$(M, d_{SR})$ is assumed to be complete, there exists a sub-Riemannian minimizing geodesic
$\gamma_{x',y'}$ between $x'$ and $y'$. Moreover by assumption it is nonsingular. As before we can
parametrize $\Delta$ by a family of smooth orthonormal vector fields along $\gamma_{x',y'}$, and we
denote by $u^{x',y'}$ the control in $L^2([0, 1], \mathbb{R}^m)$ corresponding to $\gamma_{x',y'}$. Since $u^{x',y'}$ is
nonsingular, there are $n$ linearly independent controls $v_1^{x',y'}, \ldots, v_n^{x',y'}$ in $L^2([0, 1], \mathbb{R}^m)$
such that the linear operator

$$\mathcal{E}^{x',y'} : \mathbb{R}^n \to \mathbb{R}^n, \qquad \alpha \mapsto \sum_{i=1}^n \alpha_i \, dE_{x'}(u^{x',y'})(v_i^{x',y'})$$

is invertible. Set

$$\mathcal{F}^{x',y'} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n \times \mathbb{R}^n, \qquad (z, \alpha) \mapsto \Big( z, \ E_z\Big( u^{x',y'} + \sum_{i=1}^n \alpha_i v_i^{x',y'} \Big) \Big).$$

This mapping is well-defined and smooth in a neighborhood of $(x', 0)$, satisfies

$$\mathcal{F}^{x',y'}(x', 0) = (x', y'),$$

and its differential at $(x', 0)$ is invertible. Hence, by the Inverse Function Theorem,
there are an open ball $B^{x',y'}$ centered at $(x', y')$ in $\mathbb{R}^n \times \mathbb{R}^n$ and a function
$\mathcal{G}^{x',y'} : B^{x',y'} \to \mathbb{R}^n \times \mathbb{R}^n$ such that

$$\mathcal{F}^{x',y'} \circ \mathcal{G}^{x',y'}(z, w) = (z, w) \qquad \forall (z, w) \in B^{x',y'}.$$

Denote by $(\alpha^{x',y'})^{-1}$ the second component of $\mathcal{G}^{x',y'}$. From the definition of the
sub-Riemannian energy between two points we infer that for any $(z, w) \in B^{x',y'}$ we have

$$e_{SR}(z, w) \leq \Big\| u^{x',y'} + \sum_{i=1}^n \big( (\alpha^{x',y'})^{-1}(z, w) \big)_i \, v_i^{x',y'} \Big\|_{L^2}^2.$$

Set

$$\phi^{x',y'}(z, w) := \Big\| u^{x',y'} + \sum_{i=1}^n \big( (\alpha^{x',y'})^{-1}(z, w) \big)_i \, v_i^{x',y'} \Big\|_{L^2}^2 \qquad \forall (z, w) \in B^{x',y'}.$$

We conclude that, for every $(x', y') \in U_x \times U_y$, there is a smooth function $\phi^{x',y'}$ such that
$d_{SR}^2(z, w) \leq \phi^{x',y'}(z, w)$ for any $(z, w)$ in $B^{x',y'}$. By compactness of $\mathcal{K}$ and thanks to a
quantitative version of the Inverse Function Theorem, the $C^{1,1}$ norms of the functions
$\phi^{x',y'}$ are uniformly bounded and the radii of the balls $B^{x',y'}$ are uniformly bounded
from below by a positive constant for $x', y'$ in $U_x \times U_y$. Then the result follows from
Lemma A.1. □

5.8 Sub-Riemannian cut locus 

For every $x \in M$ the singular set of $d_{SR}(x, \cdot)$, denoted by $\Sigma(d_{SR}(x, \cdot))$, is defined as
the set of points $y \neq x$ in $M$ where $d_{SR}(x, \cdot)$ (or equivalently $d_{SR}^2(x, \cdot)$) is not continuously
differentiable. The cut-locus of $x$ is defined as

$$\mathrm{Cut}_{SR}(x) := \overline{\Sigma(d_{SR}(x, \cdot))}$$

and the global cut-locus of $M$ as

$$\mathrm{Cut}_{SR}(M) := \{ (x, y) \in M \times M \mid y \in \mathrm{Cut}_{SR}(x) \}.$$

In contrast with the Riemannian case, the sub-Riemannian global cut-locus of $M$ always
contains the diagonal (see [1]). A covector $p \in T_x^*M$ is said to be conjugate with respect
to $x \in M$ if the mapping $\exp_x$ is singular at $p$, that is if $d\exp_x(p)$ is singular. For every
$x \in M$ we denote by $\mathrm{Conj}_{min}(x)$ the set of points $y \in M \setminus \{x\}$ for which there is
$p \in T_x^*M$ which is conjugate with respect to $x$, and such that

$$\exp_x(p) = y \quad \text{and} \quad e_{SR}(x, y) = 2H(x, p).$$

The following result holds (see [35, 33]):

Proposition 5.10. Let $\Omega$ be an open subset of $M \times M$. Assume that $\Omega$ is totally
geodesically convex and that the sub-Riemannian distance is locally semiconcave on
$\Omega \setminus D$. Then, for every $x \in M$, we have

$$(\{x\} \times \mathrm{Cut}_{SR}(x)) \cap \Omega = \big( \{x\} \times ( \Sigma(d_{SR}(x, \cdot)) \cup \mathrm{Conj}_{min}(x) \cup \{x\} ) \big) \cap \Omega.$$

Moreover, the set $(\{x\} \times \mathrm{Cut}_{SR}(x)) \cap \Omega$ has Hausdorff dimension $\leq n - 1$, and the
function $d_{SR}$ is of class $C^\infty$ on the open set $\Omega \setminus \mathrm{Cut}_{SR}(M)$.

An important property of the Riemannian distance function is that it fails to be
semiconvex at the cut locus (see [19, Proposition 2.5]). This property plays a key role in
the proof of the differentiability of the transport map. We do not know if that property
holds in the sub-Riemannian case:

Open problem. Assume that $d_{SR}$ is locally semiconcave on $M \times M \setminus D$. Let
$x, y \in M$, and assume that there exists a function $\phi : M \to \mathbb{R}$ twice differentiable at $y$
such that

$$\phi(y) = d_{SR}^2(x, y) \quad \text{and} \quad d_{SR}^2(x, z) \geq \phi(z) \qquad \forall z \in M.$$

Is it true that $y \notin \mathrm{Cut}_{SR}(x)$?
5.9 Locally Lipschitz regularity of the sub-Riemannian distance

Since any locally semiconcave function is locally Lipschitz, Theorem 5.9 above gives
a sufficient condition that ensures the Lipschitz regularity of $d_{SR}^2$ out of the diagonal.
In [3] Agrachev and Lee show that, under some stronger assumption, one can prove
global Lipschitz regularity. A horizontal path $\gamma : [0, 1] \to M$ will be called a Goh path
if it admits an abnormal lift $\psi : [0, 1] \to \Delta^\perp$ which annihilates $[\Delta, \Delta]$, that is, for every
$t \in [0, 1]$ and every local parametrization of $\Delta$ by smooth vector fields $f_1, \ldots, f_m$ in a
neighborhood of $\gamma(t)$, we have

$$\psi(t) \cdot \big( [f_i, f_j](\gamma(t)) \big) = 0 \qquad \forall i, j = 1, \ldots, m.$$

Note that if the path $\gamma$ is constant on $[0, 1]$, it is a Goh path if and only if there is a
differential form $p \in T^*_{\gamma(0)}M$ satisfying

$$p \cdot f_i(\gamma(0)) = p \cdot [f_i, f_j](\gamma(0)) = 0 \qquad \forall i, j = 1, \ldots, m,$$

where $f_1, \ldots, f_m$ is as above a parametrization of $\Delta$ in a neighborhood of $\gamma(0)$. Agrachev
and Lee proved the following result (see [3, Theorem 5.5]):

Theorem 5.11. Let $\Omega$ be an open subset of $M \times M$ such that any sub-Riemannian
minimizing geodesic joining two points of $\Omega$ is not a Goh path. Then the function $d_{SR}^2$
is locally Lipschitz on $\Omega \times \Omega$.

6 Proofs of the results 
6.1 Proof of Theorem 3.2

Let us first prove (i). We easily see that $\mathcal{M}^\phi$ coincides with the set

$$\{ x \in M \mid \phi(x) + \phi^c(x) < 0 \}.$$

Thus, since both $\phi$ and $\phi^c$ are continuous, $\mathcal{M}^\phi$ is open. Let us now prove that $\phi$ is locally
semiconcave (resp. locally Lipschitz) in an open neighborhood of $\mathcal{M}^\phi \cap \mathrm{supp}(\mu)$. Let
$x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu)$ be fixed. Since $x \notin \partial^c\phi(x)$, there is $r > 0$ such that $d_{SR}(x, y) > r$ for
any $y \in \partial^c\phi(x)$. In addition, since the set $\partial^c\phi$ is closed in $M \times M$ and $\mathrm{supp}(\mu \times \nu) \subset \Omega$,
there exists a neighborhood $V_x$ of $x$ which is included in $\mathcal{M}^\phi \cap \pi_1(\Omega)$ and such that

$$d_{SR}(z, w) > r \qquad \forall z \in V_x, \ \forall w \in \partial^c\phi(z).$$

Let $\phi_{x,r} : M \to \mathbb{R}$ be the function defined by

$$\phi_{x,r}(z) := \inf \{ d_{SR}^2(z, y) - \phi^c(y) \mid y \in \mathrm{supp}(\nu), \ d_{SR}(z, y) > r \}.$$

We recall that $\mathrm{supp}(\mu \times \nu) \subset \Omega$ and that $d_{SR}^2$ is locally semiconcave (resp. locally
Lipschitz) in $\Omega \setminus D$. Thus, up to considering a smaller $V_x$, we easily get that the
function $\phi_{x,r}$ is locally semiconcave (resp. locally Lipschitz) in $V_x$. Since $\phi = \phi_{x,r}$ in
$V_x$, (i) is proved.

To prove (ii), we observe that it suffices to show the result for $x$ belonging to an open
set $V \subset M$ on which the horizontal distribution $\Delta(x)$ is parametrized by an orthonormal
family of smooth vector fields $\{f_1, \ldots, f_m\}$. Moreover, up to working in charts, we can
assume that $V$ is a subset of $\mathbb{R}^n$.

First of all we remark that, since all functions $z \mapsto d_{SR}^2(z, y) - \phi^c(y)$ are locally
uniformly Lipschitz with respect to the sub-Riemannian distance when $y$ varies in
a compact set, $\phi$ is also locally Lipschitz with respect to $d_{SR}$. Up to a change of
coordinates in $\mathbb{R}^n$, we can assume that the vector fields $f_i$ are of the form

$$f_i = \frac{\partial}{\partial x_i} + \sum_{j=m+1}^n a_{ij} \frac{\partial}{\partial x_j} \qquad \forall i = 1, \ldots, m,$$

with $a_{ij} \in C^\infty(\mathbb{R}^n)$. Therefore, thanks to [31, Theorem 3.2], for a.e. $x \in V$, $\phi$ is
differentiable with respect to all vector fields $f_i$, and

$$\phi(y) - \phi(x) - \sum_{i=1}^m f_i\phi(x)(y_i - x_i) = o(d_{SR}(x, y)) \qquad \forall y \in V. \qquad (6.1)$$

Recalling that $\mu$ is absolutely continuous, we get that (6.1) holds at $\mu$-a.e. $x \in V$. Thus
it suffices to prove that $\partial^c\phi(x) = \{x\}$ for all such points.
Let us fix such an $x$. We claim that

$$f_i\phi(x) = 0 \qquad \forall i = 1, \ldots, m. \qquad (6.2)$$

Indeed, fix $i \in \{1, \ldots, m\}$ and denote by $\gamma_i^x(t) : (-\varepsilon, \varepsilon) \to M$ the integral curve of the
vector field $f_i$ starting from $x$, i.e.

$$\dot\gamma_i^x(t) = f_i(\gamma_i^x(t)) \quad \forall t \in (-\varepsilon, \varepsilon), \qquad \gamma_i^x(0) = x.$$

By the assumption on $x$, there is a real number $\ell_i$ such that

$$\lim_{t \to 0} \frac{\phi(\gamma_i^x(t)) - \phi(x)}{t} = \ell_i.$$

By construction, the curve $\gamma_i^x$ is horizontal with respect to $\Delta$. Thus, since $g(\dot\gamma_i^x(t), \dot\gamma_i^x(t)) = 1$
for any $t$, we have

$$d_{SR}(x, \gamma_i^x(t)) \leq |t| \qquad \forall t \in (-\varepsilon, \varepsilon).$$

This gives

$$\phi(\gamma_i^x(t)) \leq \phi(x) + d_{SR}^2(\gamma_i^x(t), x) \leq \phi(x) + t^2,$$

which implies that $\ell_i = 0$ and proves the claim.
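
The last implication can be made explicit by dividing the inequality $\phi(\gamma_i^x(t)) - \phi(x) \leq t^2$ by $t$ on each side of $0$. The following is a short sketch of that step.

```latex
% For t > 0 the sign of the inequality is preserved, for t < 0 it is reversed.
\[
t > 0: \quad \frac{\phi(\gamma_i^x(t)) - \phi(x)}{t} \le t
\;\xrightarrow[t \to 0^+]{}\; \ell_i \le 0,
\]
\[
t < 0: \quad \frac{\phi(\gamma_i^x(t)) - \phi(x)}{t} \ge t
\;\xrightarrow[t \to 0^-]{}\; \ell_i \ge 0 .
\]
```

Hence $\ell_i = 0$, and since $\ell_i$ is the derivative of $\phi$ along the integral curve of $f_i$ through $x$, this is exactly (6.2).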

Assume now by contradiction that there exists a point $y \in \partial^c\phi(x) \setminus \{x\}$, with
$(x, y) \in \Omega$. Then the function

$$z \mapsto \phi(z) - d_{SR}^2(z, y) \leq -\phi^c(y)$$

attains a maximum at $x$. Let $\gamma_{x,y} : [0, 1] \to M$ denote a minimizing geodesic from $x$
to $y$. Then

$$\phi(\gamma_{x,y}(t)) - d_{SR}^2(\gamma_{x,y}(t), y) \leq \phi(x) - d_{SR}^2(x, y) \qquad \forall t \in [0, 1],$$

or equivalently

$$\phi(\gamma_{x,y}(t)) - \phi(x) \leq d_{SR}^2(\gamma_{x,y}(t), y) - d_{SR}^2(x, y) \qquad \forall t \in [0, 1].$$

Observe now that, by (6.1) together with (6.2), we have

$$\phi(\gamma_{x,y}(t)) - \phi(x) = o(d_{SR}(\gamma_{x,y}(t), x)) = o(t \, d_{SR}(x, y)).$$

On the other hand, $d_{SR}^2(\gamma_{x,y}(t), y) = (1 - t)^2 d_{SR}^2(x, y)$. Combining all together, for all
$t \in [0, 1]$ we have

$$o(t \, d_{SR}(x, y)) = \phi(\gamma_{x,y}(t)) - \phi(x) \leq d_{SR}^2(\gamma_{x,y}(t), y) - d_{SR}^2(x, y) = -2t \, d_{SR}^2(x, y) + o(t \, d_{SR}(x, y)),$$

that is

$$2t \, d_{SR}^2(x, y) \leq o(t \, d_{SR}(x, y)) \qquad \forall t \in [0, 1].$$

As $x \neq y$, this is absurd for $t$ small enough, and the proof of (ii) is completed.
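
The contradiction rests on an elementary expansion of $(1 - t)^2$, which can be sketched as follows for fixed $x \neq y$.

```latex
% Expansion of the squared distance along the minimizing geodesic.
\[
d_{SR}^2(\gamma_{x,y}(t), y) - d_{SR}^2(x, y)
= \big( (1 - t)^2 - 1 \big) \, d_{SR}^2(x, y)
= -2t \, d_{SR}^2(x, y) + t^2 \, d_{SR}^2(x, y),
\]
% and t^2 d_{SR}^2(x, y) = o(t) as t -> 0. Dividing the resulting inequality by t > 0:
\[
2 \, d_{SR}^2(x, y) \le \frac{o(t)}{t} \xrightarrow[t \to 0^+]{} 0 .
\]
```

This forces $d_{SR}(x, y) = 0$, which is impossible since $x \neq y$.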

Since $\mathrm{supp}(\mu \times \nu) \subset \Omega$, we immediately have that any optimal plan $\gamma$ is concentrated
on $\partial^c\phi \cap \Omega$. Moreover, combining (i) and (ii), we obtain that $\partial^c\phi(x) \cap \mathrm{supp}(\nu)$ is
a singleton for $\mu$-a.e. $x$. This easily gives existence and uniqueness of the optimal
transport map.

To prove the formula for $T(x)$, we have to show that

$$\partial^c\phi(x) = \big\{ \exp_x\big( -\tfrac{1}{2} d\phi(x) \big) \big\}$$

for all $x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu)$ where $\phi$ is differentiable. This is a consequence of Proposition
5.5 applied to the function $z \mapsto \phi(z) + \phi^c(y)$ at the point $x$. Moreover, again by
Proposition 5.5, the geodesic from $x$ to $T(x)$ is unique for $\mu$-a.e. $x \in \mathcal{M}^\phi \cap \mathrm{supp}(\mu)$.
Since $T(x) = x$ for $x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu)$, the geodesic is clearly unique also in this case.

6.2 Proof of Theorem 3.3

We will prove only (ii), as all the rest follows as in the proof of Theorem 3.2.
Let us consider the "bad" set defined by

$$B := \{ x \in \mathcal{S}^\phi \cap \mathrm{supp}(\mu) \mid (\partial^c\phi(x) \setminus \{x\}) \cap \mathrm{supp}(\nu) \neq \emptyset \}.$$

We have to show that $B$ is $\mu$-negligible. For each $k \in \mathbb{N}$, we consider the sequence of
functions constructed as follows:

$$\phi_k(x) := \inf \{ d_{SR}^2(x, y) - \phi^c(y) \mid y \in \mathrm{supp}(\nu), \ d_{SR}(x, y) > 1/k \}.$$

Since $\mathrm{supp}(\mu \times \nu) \subset \Omega$ and $d_{SR}^2$ is locally semiconcave in $\Omega \setminus D$, the functions $\phi_k$ are
locally semiconcave in a neighborhood of $B$.

Thus, by Theorem A.4 and the assumptions on $\mu$, there exists a Borel set $G$, with
$\mu(G) = 1$, such that all $\phi_k$ are differentiable in $G$. Since for any $x \in B$ there exists
$y \in \partial^c\phi(x) \setminus \{x\}$ such that $d_{SR}(y, x) > 1/k$ for some $k$, we deduce that

$$\bigcup_{k \in \mathbb{N}} \big( B \cap \{ \phi_k = \phi \} \big) = B.$$
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This gives that, up to set of /i-measure zero, B coincides with UketjAk, where 

A k := £ n {</> = fc } n G. 

Hence, to conclude the proof, it suffices to show that fi(Ak) = for all k G IN. 
Let $x \in A_k$. Then, if $y \in \partial^c\phi(x)$ and $d_{SR}(x,y) > 1/k$, the function

$$z \mapsto \phi(z) - d_{SR}^2(z,y) \le -\phi^c(y) \tag{6.3}$$

attains a maximum at $x$. Therefore, if we show that $d\phi_k(x) = 0$ for $\mu$-a.e. $x \in A_k$,
equation (6.3) together with the semiconcavity of $d_{SR}^2(z,y)$ for $z$ close to $x$ would imply
that $d_{SR}^2(\cdot,y)$ is differentiable at $x$, and its differential is equal to $0$. This would
contradict Proposition 5.7, concluding the proof. Therefore we just need to show that
$d\phi_k(x) = 0$ $\mu$-a.e. in $A_k$.

Let $X$ be a smooth section of $\Delta$ such that $g_x(X(x), X(x)) = 1$ for any $x \in M$. We
claim the following:

Claim 1: for $\mu$-a.e. $x \in A_k$, $d\phi_k(x) \cdot X(x) \le 0$.

Since we can apply Claim 1 with a countable set of vector fields $\{X_i\}_{i \in \mathbb{N}}$ such that
$\{X_i(x)\}_{i \in \mathbb{N}}$ is dense in $\Delta(x)$ for all $x \in \operatorname{supp}(\mu)$, Claim 1 clearly implies that $d\phi_k(x) = 0$
$\mu$-a.e. in $A_k$. Let us prove the claim.

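Let $d_g$ denote the Riemannian distance associated to the Riemannian metric $g$, and
$\theta(x,t)$ denote the flow of $X$, that is the function $\theta : M \times \mathbb{R} \to M$ satisfying

$$\frac{d}{dt}\theta(x,t) = X(\theta(x,t)), \qquad \theta(x,0) = x.$$

Fix $\varepsilon > 0$ small, and consider the "cone" around the curve $t \mapsto \theta(x,t)$ given by

$$C_x^\varepsilon := \big\{y \in M \;\big|\; \exists\, t \in [0,\varepsilon] \text{ such that } d_g(\theta(x,t),y) \le \varepsilon t\big\}.$$

Moreover we define

$$R_\varepsilon := \big\{x \in \operatorname{supp}(\mu) \cap A_k \;\big|\; A_k \cap C_x^\varepsilon = \{x\}\big\}.$$

Claim 2: $R_\varepsilon$ is countably $(n-1)$-rectifiable for any $\varepsilon > 0$.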

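Indeed, since the statement is local, we can assume that we are in $\mathbb{R}^n$. Moreover,
since $X$ is smooth, we can assume that there exists $v \in \mathbb{R}^n$ such that $C_x^\varepsilon$ contains the
"euclidean cone"

$$C^{x,+}_{\varepsilon/2} := \big\{y \in \mathbb{R}^n \;\big|\; \exists\, t \in [0,\varepsilon/2] \text{ such that } |x + tv - y| \le c_0 t\big\},$$

where $c_0 > 0$. Thus it suffices to prove that

$$\tilde R_{\varepsilon/2} := \big\{x \in \operatorname{supp}(\mu) \cap A_k \;\big|\; A_k \cap C^{x,+}_{\varepsilon/2} = \{x\}\big\}$$

is $(n-1)$-rectifiable for any $\varepsilon > 0$.

Assume now that $z, z' \in \tilde R_{\varepsilon/2}$, with $z \neq z'$. Then, since $z \notin C^{z',+}_{\varepsilon/2}$, we have

$$|z' + tv - z| > c_0 t \qquad \forall t \in [0,\varepsilon/2],$$

or equivalently

$$|z - tv - z'| > c_0 t \qquad \forall t \in [0,\varepsilon/2].$$

This implies that $z' \notin C^{z,-}_{\varepsilon/2}$, where

$$C^{x,-}_{\varepsilon/2} := \big\{y \in \mathbb{R}^n \;\big|\; \exists\, t \in [0,\varepsilon/2] \text{ such that } |x - tv - y| \le c_0 t\big\}.$$

Since $z, z' \in \tilde R_{\varepsilon/2}$ were arbitrary, we have proved that for all $z \in \tilde R_{\varepsilon/2}$

$$\tilde R_{\varepsilon/2} \cap \big(C^{z,+}_{\varepsilon/2} \cup C^{z,-}_{\varepsilon/2}\big) = \{z\}.$$

By [14, Theorem 4.1.6], $R_\varepsilon$ is countably $(n-1)$-rectifiable for any $\varepsilon > 0$, and this
concludes the proof of Claim 2.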

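Let us come back to the proof of Claim 1. Thanks to Claim 2 we just need to show
that

$$x \in \big(\operatorname{supp}(\mu) \cap A_k\big) \setminus \Big(\bigcup_j R_{1/j}\Big) \;\Longrightarrow\; d\phi_k(x) \cdot X(x) \le 0.$$

Let $x \in (\operatorname{supp}(\mu) \cap A_k) \setminus (\bigcup_j R_{1/j})$. Then $\phi(x) = \phi_k(x)$, and there exists a sequence
of points $\{x_j\}$ such that $x_j \neq x$ and $x_j \in A_k \cap C_x^{1/j}$ for all $j \in \mathbb{N}$. In particular
$\phi(x_j) = \phi_k(x_j)$ for all $j \in \mathbb{N}$. Since $x \in \mathcal{S}^\phi$, we have $x \in \partial^c\phi(x)$, and so

$$\phi(z) - \phi(x) \le d_{SR}^2(z,x) \qquad \forall z \in M.$$

Let $t_j \in [0,1/j]$ be such that $d_g(\theta(x,t_j),x_j) \le \frac{1}{j}\,t_j$. Then, since $d_{SR}^2$ is locally Lipschitz,
we get

$$\begin{aligned}
\phi_k(x_j) - \phi_k(x) = \phi(x_j) - \phi(x) &\le d_{SR}^2(x_j,x) \\
&\le 2\,d_{SR}^2(\theta(x,t_j),x_j) + 2\,d_{SR}^2(\theta(x,t_j),x) \\
&\le C\,d_g(\theta(x,t_j),x_j) + 2\,d_{SR}^2(\theta(x,t_j),x) \\
&\le \frac{C}{j}\,t_j + 2\,d_{SR}^2(\theta(x,t_j),x).
\end{aligned}$$

We now observe that, since $X$ is a unitary horizontal vector field, $d_{SR}(\theta(x,t_j),x) \le t_j$.
Moreover $t_j = d_g(x_j,x) + o(d_g(x_j,x))$ as $j \to \infty$. Therefore, up to subsequences, one
easily gets (looking at everything in charts)

$$\lim_{j \to +\infty} \frac{\phi_k(x_j) - \phi_k(x)}{d_g(x_j,x)} \le 0,$$

which implies

$$d\phi_k(x) \cdot X(x) \le 0,$$

as wanted.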
6.3 Proof of Theorem 3.3

Let us first prove the uniqueness of the Wasserstein geodesic. A basic representation
theorem (see [38, Corollary 7.22]) states that any Wasserstein geodesic necessarily takes
the form $\mu_t = (e_t)_\#\Pi$, where $\Pi$ is a probability measure on the set $\Gamma$ of minimizing
geodesics $[0,1] \to M$, and $e_t : \Gamma \to M$ is the evaluation at time $t$: $e_t(\gamma) := \gamma(t)$. Thus
uniqueness follows easily from Theorem 3.2.

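The proof of the absolute continuity of $\mu_t$ is done as follows. Fix $t \in (0,1)$, and
define the functions

$$\phi_{1-t}(x) := \inf_{y \in \operatorname{supp}(\nu)} \Big\{\frac{d_{SR}^2(x,y)}{1-t} - \phi^c(y)\Big\},$$

$$\phi^c_t(y) := \inf_{x \in \operatorname{supp}(\mu)} \Big\{\frac{d_{SR}^2(x,y)}{t} - \phi(x)\Big\}.$$

It is not difficult to see that

$$d_{SR}^2(x,y) \le \frac{d_{SR}^2(x,z)}{t} + \frac{d_{SR}^2(z,y)}{1-t} \qquad \forall x,y,z \in M. \tag{6.4}$$

Indeed, for all $\varepsilon > 0$,

$$d_{SR}^2(x,y) \le \big(d_{SR}(x,z) + d_{SR}(z,y)\big)^2 \le (1+\varepsilon)\,d_{SR}^2(x,z) + \Big(1 + \frac{1}{\varepsilon}\Big)\,d_{SR}^2(z,y).$$

Choosing $\varepsilon > 0$ so that $1 + \varepsilon = 1/t$, (6.4) follows. Since $\phi(x) + \phi^c(y) \le d_{SR}^2(x,y)$ for
all $x \in \operatorname{supp}(\mu)$ and $y \in \operatorname{supp}(\nu)$, by (6.4) we get

$$\Big[\frac{d_{SR}^2(z,y)}{1-t} - \phi^c(y)\Big] + \Big[\frac{d_{SR}^2(x,z)}{t} - \phi(x)\Big] \ge 0 \qquad \forall x \in \operatorname{supp}(\mu),\ y \in \operatorname{supp}(\nu),\ z \in M.$$

This implies

$$\phi_{1-t}(z) + \phi^c_t(z) \ge 0 \qquad \forall z \in M. \tag{6.5}$$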
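
The elementary inequality behind (6.4) can be checked directly. The sketch below uses the shorthand $a := d_{SR}(x,z)$ and $b := d_{SR}(z,y)$, which are labels introduced here and not part of the original text.

```latex
% Labels for this sketch: a := d_{SR}(x,z) >= 0, b := d_{SR}(z,y) >= 0.
% Young's inequality 2ab <= eps a^2 + b^2/eps gives, for every eps > 0,
\[
(a+b)^2 = a^2 + 2ab + b^2 \le (1+\varepsilon)\,a^2 + \Bigl(1+\frac{1}{\varepsilon}\Bigr) b^2 .
\]
% With 1 + eps = 1/t, that is eps = (1-t)/t for t in (0,1):
\[
1 + \frac{1}{\varepsilon} = 1 + \frac{t}{1-t} = \frac{1}{1-t},
\]
% and the triangle inequality d_{SR}(x,y) <= a + b yields (6.4):
\[
d_{SR}^2(x,y) \le \frac{a^2}{t} + \frac{b^2}{1-t}.
\]
```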

We now remark that (6.4) becomes an equality if and only if there exists a geodesic
$\gamma : [0,1] \to M$ joining $x$ to $y$ such that $z = \gamma(t)$. Hence by the definition of $T_t(x)$ we
get

$$\frac{d_{SR}^2(x,T_t(x))}{t} + \frac{d_{SR}^2(T_t(x),T(x))}{1-t} = d_{SR}^2(x,T(x)) \qquad \text{for $\mu$-a.e. } x. \tag{6.6}$$

Moreover, since

$$\phi(x) + \phi^c(T(x)) = d_{SR}^2(x,T(x)) \qquad \text{for $\mu$-a.e. } x,$$

we obtain

$$\phi_{1-t}(T_t(x)) + \phi^c_t(T_t(x)) = 0 \qquad \text{for $\mu$-a.e. } x,$$

or equivalently

$$\phi_{1-t}(z) + \phi^c_t(z) = 0 \qquad \text{for $\mu_t$-a.e. } z. \tag{6.7}$$
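
Identity (6.6) can be verified by hand. The following sketch assumes that $T_t(x)$ is the point at time $t$ on the constant-speed minimizing geodesic from $x$ to $T(x)$; the shorthand $D := d_{SR}(x,T(x))$ is introduced only for this computation.

```latex
% Assumption for this sketch: T_t(x) = \gamma_x(t), with \gamma_x the constant-speed
% minimizing geodesic from x to T(x). Shorthand: D := d_{SR}(x,T(x)).
\[
d_{SR}(x,T_t(x)) = t\,D, \qquad d_{SR}(T_t(x),T(x)) = (1-t)\,D ,
\]
\[
\frac{d_{SR}^2(x,T_t(x))}{t} + \frac{d_{SR}^2(T_t(x),T(x))}{1-t}
  = \frac{t^2 D^2}{t} + \frac{(1-t)^2 D^2}{1-t}
  = t\,D^2 + (1-t)\,D^2 = D^2 .
\]
```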
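Let us now decompose the set $\mathcal{M}^\phi \cap \operatorname{supp}(\mu)$ as $\bigcup_{k \in \mathbb{N}} A_k$, where

$$A_k := \big\{x \in \mathcal{M}^\phi \cap \operatorname{supp}(\mu) \;\big|\; d_{SR}(x,y) > 1/k \ \ \forall y \in \partial^c\phi(x)\big\}.$$

Since $T_t(x) = x$ on $\mathcal{S}^\phi \cap \operatorname{supp}(\mu)$, defining $\mu_t^k := \mu_t \llcorner T_t(A_k)$ we have

$$\mu_t = \Big(\bigcup_{k \in \mathbb{N}} \mu_t^k\Big) \cup \mu \llcorner \big(\mathcal{S}^\phi \cap \operatorname{supp}(\mu)\big) \qquad \forall t \in [0,1].$$

Thus it suffices to prove that $\mu_t^k$ is absolutely continuous for each $k \in \mathbb{N}$.
We consider the functions

$$\phi_{k,1-t}(x) := \inf\Big\{\frac{d_{SR}^2(x,y)}{1-t} - \phi^c(y) \;\Big|\; y \in \operatorname{supp}(\nu),\ d_{SR}(x,y) > (1-t)/k\Big\},$$

$$\phi^c_{k,t}(y) := \inf\Big\{\frac{d_{SR}^2(x,y)}{t} - \phi(x) \;\Big|\; x \in \operatorname{supp}(\mu),\ d_{SR}(x,y) > t/k\Big\}.$$

Since $d_{SR}(x,T(x)) > 1/k$ for $x \in A_k$, they coincide respectively with $\phi_{1-t}$ and $\phi^c_t$ inside
$T_t(A_k)$. Thus, thanks to (6.5) and (6.7), we have

$$\phi_{k,1-t}(z) + \phi^c_{k,t}(z) \ge \phi_{1-t}(z) + \phi^c_t(z) \ge 0 \qquad \forall z \in M,$$

with equality $\mu_t$-a.e. on $T_t(A_k)$.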

Observe now that, by the compactness of the supports of $\mu$ and $\nu$, and the fact
that $\Omega$ is totally geodesically convex, $\operatorname{supp}(\mu \times \mu_t)$ and $\operatorname{supp}(\mu_t \times \nu)$ are compact and
contained in $\Omega$. Thus, since $d_{SR}^2$ is locally semiconcave on $\Omega \setminus D$, both functions $\phi_{k,1-t}$
and $\phi^c_{k,t}$ are locally semiconcave in a neighborhood of $T_t(A_k)$. It follows from [21,
Theorem A.19] that both differentials $d\phi_{k,1-t}(z)$, $d\phi^c_{k,t}(z)$ exist and satisfy
$d\phi^c_{k,t}(z) = -d\phi_{k,1-t}(z)$ for $\mu_t$-a.e. $z \in T_t(A_k)$. Moreover, again by [21, Theorem A.19],
the map $z \mapsto d\phi^c_{k,t}(z) = -d\phi_{k,1-t}(z)$ is locally Lipschitz on $T_t(A_k)$. Since for $x \in A_k$ we have

$$\phi^c_{k,t}(\cdot) \le \frac{d_{SR}^2(x,\cdot)}{t} - \phi(x) \qquad \text{on } \{z \mid d_{SR}(x,z) > t/k\}$$

with equality at $T_t(x)$ for $\mu$-a.e. $x \in A_k$, by Proposition 5.5 (applied to $t\,\phi^c_{k,t}$) we get

$$x = \exp_{T_t(x)}\Big(-\frac{t}{2}\, d\phi^c_{k,t}(T_t(x))\Big) \qquad \text{for $\mu$-a.e. } x \in A_k.$$

Denoting by $\Phi_t : T^*M \to T^*M$ the Euler-Lagrange flow (i.e. the flow of the Hamiltonian
vector field $\vec{H}$), and recalling that the exponential map is obtained by projecting this
flow onto $M$, we see that the map

$$F_{t,k}(z) := \exp_z\Big(-\frac{t}{2}\, d\phi^c_{k,t}(z)\Big)$$

is locally Lipschitz on $\operatorname{supp}(\mu_t) \cap T_t(A_k)$. Therefore it is clear that $\mu_t^k$ cannot have a
singular part with respect to the volume measure, since otherwise the same would be
true for $(F_{t,k})_\#(\mu_t^k) = \mu \llcorner A_k$. This concludes the proof of the absolute continuity.
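
The last step can be spelled out as follows. This is a sketch under two assumptions: that $\mu$ is absolutely continuous with respect to the volume measure (as in the hypotheses of the theorem), and that $N$ is a compact volume-null subset of $\operatorname{supp}(\mu_t) \cap T_t(A_k)$. The names $N$ and $F$ are introduced here only for illustration.

```latex
% Labels for this sketch: F := F_{t,k}; N compact, N inside supp(mu_t) cap T_t(A_k), vol(N) = 0.
% Locally Lipschitz maps send volume-null sets to volume-null sets, and mu << vol:
\[
\operatorname{vol}(N) = 0
\;\Longrightarrow\; \operatorname{vol}\bigl(F(N)\bigr) = 0
\;\Longrightarrow\; \mu\bigl(F(N)\bigr) = 0 .
\]
% Since N is contained in F^{-1}(F(N)) and F_# mu_t^k = mu restricted to A_k:
\[
\mu_t^k(N) \le \mu_t^k\bigl(F^{-1}(F(N))\bigr)
 = (F_{\#}\mu_t^k)\bigl(F(N)\bigr)
 = \mu\bigl(F(N) \cap A_k\bigr) = 0 .
\]
% By inner regularity, mu_t^k then vanishes on every Borel volume-null set.
```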
6.4 Proof of Theorem 3.7



We recall that, by Theorem 3.2, the function $\phi$ is locally semiconcave in a neighborhood
of $\mathcal{M}^\phi \cap \operatorname{supp}(\mu)$. Thus, since $\mu$ is absolutely continuous with respect to the volume
measure, by Theorem A.5 $d\phi(x)$ is differentiable for $\mu$-a.e. $x \in \mathcal{M}^\phi \cap \operatorname{supp}(\mu)$. By
Theorem 3.2, for $\mu$-a.e. $x$ there exists a unique minimizing geodesic between $x$ and
$T(x)$. Thanks to our assumptions this implies that $T(x) = \exp_x(-\tfrac12 d\phi(x))$ does not
belong to $\operatorname{Cut}_{SR}(x)$ for $\mu$-a.e. $x \in \mathcal{M}^\phi \cap \operatorname{supp}(\mu)$. Hence Proposition 5.10 implies that
the function

$$(z,w) \mapsto d_{SR}^2(z,w)$$

is smooth near $(x,T(x))$. Exactly as in the Riemannian case, this gives that the
map $x \mapsto \exp_x(-\tfrac12 d\phi(x))$ is differentiable for $\mu$-a.e. $x$, and its differential is given by
$Y(x)\big(H(x) - \tfrac12 \operatorname{Hess}_x\phi\big)$ (see [19, Proposition 4.1]). On the other hand, since $T(x) = x$
for $x \in \mathcal{S}^\phi \cap \operatorname{supp}(\mu)$, it is clear by Definition 3.6 that $T$ is approximately differentiable
$\mu$-a.e. in $\mathcal{S}^\phi \cap \operatorname{supp}(\mu)$, and that its approximate differential is given by the identity
matrix $I$. This proves the first part of the theorem.

To prove the change of variable formula, we first remark that, since both $\mu$ and $\nu$
are absolutely continuous, there exists also an optimal transport map $S$ from $\nu$ to $\mu$,
and it is well-known that $S$ is an inverse for $T$ a.e., that is

$$S \circ T = \mathrm{Id} \quad \mu\text{-a.e.}, \qquad T \circ S = \mathrm{Id} \quad \nu\text{-a.e.}$$

(see for instance [TJ Remark 6.2.11]). This gives in particular that $T$ is a.e. injective.
Applying [TJ Lemma 5.5.3] (whose proof is in the Euclidean case, but still works on
a manifold) we deduce that $|\det(dT(x))| > 0$ $\mu$-a.e., and that the Jacobian identity
holds.

A Locally semiconcave functions 

The aim of this section is to recall some basic facts on semiconcavity. Throughout this 
section, M denotes a smooth connected manifold of dimension n. 

For an introduction to semiconcavity, we refer the reader to [H] and [21, Appendix
A]. A function $u : U \to \mathbb{R}$, defined on the open set $U \subset M$, is called locally semiconcave
on $U$ if for every $x \in U$ there exist a neighborhood $U_x$ of $x$ and a smooth diffeomorphism
$\varphi_x : U_x \to \varphi_x(U_x) \subset \mathbb{R}^n$ such that $u \circ \varphi_x^{-1}$ is locally semiconcave on the open subset
$\tilde U_x = \varphi_x(U_x) \subset \mathbb{R}^n$. We recall that the function $u : U \to \mathbb{R}$, defined on the open set
$U \subset \mathbb{R}^n$, is locally semiconcave on $U$ if for every $\bar x \in U$ there exist $C, \delta > 0$ such that

$$\mu u(x) + (1-\mu)u(y) - u(\mu x + (1-\mu)y) \le \mu(1-\mu)C|x-y|^2, \tag{A.1}$$

for all $x, y$ in the ball $B_\delta(\bar x)$ and every $\mu \in [0,1]$. This is equivalent to saying that the
function $u$ can be written locally as

$$u(x) = \big(u(x) - C|x|^2\big) + C|x|^2 \qquad \forall x \in B_\delta(\bar x),$$

with $u(x) - C|x|^2$ concave. Note that every locally semiconcave function is locally
Lipschitz on its domain, and thus by Rademacher's Theorem it is differentiable almost
everywhere on its domain (in fact a better result holds, see Theorem A.4). The following
result will be useful in the proof of our theorems.
Lemma A.1. Let $u : U \to \mathbb{R}$ be a function defined on an open set $U \subset \mathbb{R}^n$. Assume
that for every $\bar x \in U$ there exist a neighborhood $V \subset U$ of $\bar x$ and a positive real number
$\sigma$ such that, for every $x \in V$, there is $p_x \in \mathbb{R}^n$ such that

$$u(y) \le u(x) + \langle p_x, y - x\rangle + \sigma|y-x|^2 \qquad \forall y \in V. \tag{A.2}$$

Then the function $u$ is locally semiconcave on $U$.

Proof. Let $\bar x \in U$ be fixed and $V$ be the neighborhood given by assumption. Without
loss of generality, we can assume that $V$ is an open ball $B$. Let $x, y \in B$ and $\mu \in [0,1]$.
The point $\hat x := \mu x + (1-\mu)y$ belongs to $B$. By assumption, there exists $\hat p \in \mathbb{R}^n$ such
that

$$u(z) \le u(\hat x) + \langle \hat p, z - \hat x\rangle + \sigma|z - \hat x|^2 \qquad \forall z \in B.$$

Hence we easily get

$$\begin{aligned}
\mu u(x) + (1-\mu)u(y) &\le u(\hat x) + \mu\sigma|x - \hat x|^2 + (1-\mu)\sigma|y - \hat x|^2 \\
&\le u(\hat x) + \big(\mu(1-\mu)^2\sigma + (1-\mu)\mu^2\sigma\big)|x-y|^2 \\
&\le u(\hat x) + 2\mu(1-\mu)\sigma|x-y|^2,
\end{aligned}$$

and the conclusion follows. □
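
The computation hidden behind "we easily get" can be written out as follows. This is a sketch in which $\hat x := \mu x + (1-\mu)y$ denotes the intermediate point, $\hat p$ the vector given by (A.2) at $\hat x$, and $\sigma$ the constant in (A.2); these are labels used for this illustration.

```latex
% Labels for this sketch: \hat x := \mu x + (1-\mu) y, \hat p the vector from (A.2) at \hat x.
\[
x - \hat x = (1-\mu)(x-y), \qquad y - \hat x = -\mu\,(x-y).
\]
% Apply (A.2) at z = x and z = y and take the convex combination; the linear terms cancel:
\[
\mu\langle \hat p, x-\hat x\rangle + (1-\mu)\langle \hat p, y-\hat x\rangle
  = \langle \hat p,\ \mu x + (1-\mu) y - \hat x\rangle = 0 .
\]
% The quadratic terms combine exactly:
\[
\mu\,\sigma|x-\hat x|^2 + (1-\mu)\,\sigma|y-\hat x|^2
  = \bigl(\mu(1-\mu)^2 + (1-\mu)\mu^2\bigr)\sigma|x-y|^2
  = \mu(1-\mu)\,\sigma|x-y|^2 .
\]
```

In particular (A.1) already holds with $C = \sigma$; the factor $2$ in the last line of the proof is not sharp, but it is harmless.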

Another useful result is the following (see [14, Corollary 3.3.8]):

Proposition A.2. Let $u : U \to \mathbb{R}$ be a function defined on an open set $U \subset M$. If
both functions $u$ and $-u$ are locally semiconcave on $U$, then $u$ is of class $C^{1,1}_{\mathrm{loc}}$ on $U$.

Fathi generalized the proposition above as follows (see [20] or [21, Theorem A.19]):

Proposition A.3. Let $U$ be an open subset of $M$ and $u_1, u_2 : U \to \mathbb{R}$ be two functions
with $u_1$ and $-u_2$ locally semiconcave on $U$. Assume that $u_2(x) \le u_1(x)$ for any $x \in U$.
If we define $E = \{x \in U \mid u_1(x) = u_2(x)\}$, then both $u_1$ and $u_2$ are differentiable at each
$x \in E$ with $du_1(x) = du_2(x)$ at such a point. Moreover, the map $x \mapsto du_1(x) = du_2(x)$
is locally Lipschitz on $E$.

A.l Singular sets of semiconcave functions 

Let $u : U \to \mathbb{R}$ be a function which is locally semiconcave on the open set $U \subset M$. We
recall that, since such a function is locally Lipschitz on $U$, its limiting subdifferential
is always nonempty on $U$. We define the singular set of $u$ as the subset of $U$

$$\Sigma(u) := \{x \in U \mid u \text{ is not differentiable at } x\}.$$

From Rademacher's theorem, $\Sigma(u)$ has Lebesgue measure zero. In fact, the following
result holds (see [HEH]):

Theorem A.4. Let $U$ be an open subset of $M$. The singular set of a locally semiconcave
function $u : U \to \mathbb{R}$ is countably $(n-1)$-rectifiable, i.e. is contained in a countable
union of locally Lipschitz hypersurfaces of $M$.
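
As a simple illustration of Theorem A.4, consider the following example, which is chosen here for concreteness and is not taken from the original text: the concave function $u(x) = -|x_1|$ on $\mathbb{R}^n$.

```latex
% Illustrative example (not from the original text): U = \mathbb{R}^n, u(x) = -|x_1|.
\[
u(x) = -|x_1| \ \text{is concave, hence locally semiconcave, and}\qquad
\Sigma(u) = \{x \in \mathbb{R}^n \mid x_1 = 0\}.
\]
% \Sigma(u) is a single hyperplane: it is (n-1)-rectifiable and has Lebesgue measure zero.
% By contrast, v(x) = |x_1| is not locally semiconcave near \{x_1 = 0\}:
% taking x = s e_1, y = -s e_1 and \mu = 1/2 in (A.1) would require
\[
\tfrac12 v(s e_1) + \tfrac12 v(-s e_1) - v(0) = s \le \tfrac14\, C\,|2s|^2 = C s^2
\qquad \text{for all small } s > 0,
\]
% which fails as s tends to 0.
```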
A.2 Alexandrov's second differentiability theorem

As shown by Alexandrov (see [38]), locally semiconcave functions are two times
differentiable almost everywhere.

Theorem A.5. Let $U$ be an open subset of $\mathbb{R}^n$ and $u : U \to \mathbb{R}$ be a function which is
locally semiconcave on $U$. Then, for a.e. $x \in U$, $u$ is differentiable at $x$ and there exists
a symmetric operator $A(x) : \mathbb{R}^n \to \mathbb{R}^n$ such that the following property is satisfied:

$$\lim_{t \downarrow 0} \frac{u(x+tv) - u(x) - t\,du(x)\cdot v - \frac{t^2}{2}\langle A(x)\cdot v, v\rangle}{t^2} = 0 \qquad \forall v \in \mathbb{R}^n.$$

Moreover, $du(x)$ is differentiable a.e. in $U$, and its differential is given by $A(x)$.



B Proofs of auxiliary results 
B.1 Proof of Proposition 4.4

The first part of the proposition is just a corollary of Proposition 4.8 for $n = 3$. Let us
prove the second part of the proposition. Let $\gamma : [0,1] \to M$ be a nontrivial singular
horizontal path. Our aim is to show that, for every $t \in [0,1]$, the point $\gamma(t)$ belongs
to $\Sigma_\Delta$. Fix $\bar t \in [0,1]$ and parametrize the distribution by two smooth vector fields
$f_1, f_2$ in an open neighborhood $V$ of $\gamma(\bar t)$. Let $u \in L^2([0,1],\mathbb{R}^2)$, and let $I$ be an open
subinterval of $[0,1]$ containing $\bar t$ such that

$$\dot\gamma(t) = u_1(t)f_1(\gamma(t)) + u_2(t)f_2(\gamma(t)) \qquad \text{for a.e. } t \in I.$$

Note that since $\gamma$ is assumed to be nontrivial, we can assume that $u$ is not identically
zero in any neighborhood of $\bar t$. From Proposition 5.3, there is an arc $p : [0,1] \to
(\mathbb{R}^3)^* \setminus \{0\}$ in $W^{1,2}$ such that

$$\dot p(t) = -u_1(t)\,p(t)\cdot df_1(\gamma(t)) - u_2(t)\,p(t)\cdot df_2(\gamma(t)) \qquad \text{for a.e. } t \in I,$$

and

$$p(t)\cdot f_1(\gamma(t)) = p(t)\cdot f_2(\gamma(t)) = 0 \qquad \forall t \in I.$$

Let us take the derivative of the quantity $p(t)\cdot f_1(\gamma(t))$ (which is absolutely continuous).
We have for almost every $t \in I$,

$$\begin{aligned}
0 &= \frac{d}{dt}\big[p(t)\cdot f_1(\gamma(t))\big] \\
&= \dot p(t)\cdot f_1(\gamma(t)) + p(t)\cdot df_1(\gamma(t))\cdot\dot\gamma(t) \\
&= -\sum_{i=1,2} u_i(t)\,p(t)\cdot df_i(\gamma(t))\cdot f_1(\gamma(t)) + \sum_{i=1,2} u_i(t)\,p(t)\cdot df_1(\gamma(t))\cdot f_i(\gamma(t)) \\
&= -u_2(t)\,p(t)\cdot[f_1,f_2](\gamma(t)).
\end{aligned}$$

In the same way, if we differentiate the quantity $p(t)\cdot f_2(\gamma(t))$, we obtain

$$0 = \frac{d}{dt}\big[p(t)\cdot f_2(\gamma(t))\big] = u_1(t)\,p(t)\cdot[f_1,f_2](\gamma(t)).$$
Therefore, since $u$ is not identically zero in any neighborhood of $\bar t$, thanks to the
continuity of the mapping $t \mapsto p(t)\cdot[f_1,f_2](\gamma(t))$ we deduce that

$$p(\bar t)\cdot[f_1,f_2](\gamma(\bar t)) = 0.$$

But we already know that $p(\bar t)\cdot f_1(\gamma(\bar t)) = p(\bar t)\cdot f_2(\gamma(\bar t)) = 0$, where the two
vectors $f_1(\gamma(\bar t))$, $f_2(\gamma(\bar t))$ are linearly independent. Therefore, since $p(\bar t) \neq 0$, we
conclude that the Lie bracket $[f_1,f_2](\gamma(\bar t))$ belongs to the linear subspace spanned by
$f_1(\gamma(\bar t))$, $f_2(\gamma(\bar t))$, which means that $\gamma(\bar t)$ belongs to $\Sigma_\Delta$. Let us now prove that any
horizontal path included in $\Sigma_\Delta$ is singular. Let $\gamma$ be such a path, set $\gamma(0) = x$,
and consider a parametrization of $\Delta$ by two vector fields $f_1, f_2$ in a neighborhood $V$ of
$x$. Let $\delta > 0$ be small enough so that $\gamma(t) \in V$ for any $t \in [0,\delta]$, in such a way that
there is $u \in L^2([0,\delta],\mathbb{R}^2)$ satisfying

$$\dot\gamma(t) = u_1(t)f_1(\gamma(t)) + u_2(t)f_2(\gamma(t)) \qquad \text{for a.e. } t \in [0,\delta].$$

Let $p_0 \in (\mathbb{R}^3)^* \setminus \{0\}$ be such that $p_0\cdot f_1(x) = p_0\cdot f_2(x) = 0$, and let $p : [0,\delta] \to (\mathbb{R}^3)^*$ be the
solution to the Cauchy problem

$$\dot p(t) = -\sum_{i=1,2} u_i(t)\,p(t)\cdot df_i(\gamma(t)) \quad \text{for a.e. } t \in [0,\delta], \qquad p(0) = p_0.$$

Define two absolutely continuous functions $h_1, h_2 : [0,\delta] \to \mathbb{R}$ by

$$h_i(t) = p(t)\cdot f_i(\gamma(t)) \qquad \forall t \in [0,\delta],\ \forall i = 1,2.$$

As above, for almost every $t \in [0,\delta]$ we have

$$\dot h_1(t) = \frac{d}{dt}\big[p(t)\cdot f_1(\gamma(t))\big] = -u_2(t)\,p(t)\cdot[f_1,f_2](\gamma(t))$$

and

$$\dot h_2(t) = u_1(t)\,p(t)\cdot[f_1,f_2](\gamma(t)).$$

But since $\gamma(t) \in \Sigma_\Delta$ for every $t$, there are two continuous functions $\lambda_1, \lambda_2 : [0,\delta] \to \mathbb{R}$
such that

$$[f_1,f_2](\gamma(t)) = \lambda_1(t)f_1(\gamma(t)) + \lambda_2(t)f_2(\gamma(t)) \qquad \forall t \in [0,\delta].$$

This implies that the pair $(h_1,h_2)$ is a solution of the linear differential system

$$\begin{cases}
\dot h_1(t) = -u_2(t)\lambda_1(t)h_1(t) - u_2(t)\lambda_2(t)h_2(t) \\
\dot h_2(t) = u_1(t)\lambda_1(t)h_1(t) + u_1(t)\lambda_2(t)h_2(t).
\end{cases}$$

Since $h_1(0) = h_2(0) = 0$ by construction, we deduce by the Cauchy-Lipschitz Theorem
that $h_1(t) = h_2(t) = 0$ for any $t \in [0,\delta]$. In that way, we have constructed an abnormal
lift of $\gamma$ on the interval $[0,\delta]$. We can in fact repeat this construction on a new interval
of the form $[\delta,2\delta]$ (with initial condition $p(\delta)$) and finally obtain an abnormal lift of $\gamma$
on $[0,1]$. By Proposition 5.2, we conclude that $\gamma$ is singular.
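
The uniqueness step can also be seen through Gronwall's inequality. The sketch below is one standard way to justify it, not necessarily the argument the authors had in mind; the vector $h$ and the matrix $A(t)$ are notation introduced here.

```latex
% Notation for this sketch: h := (h_1, h_2)^T, and A(t) is the coefficient matrix of the system.
\[
\dot h(t) = A(t)\,h(t), \qquad
A(t) := \begin{pmatrix} -u_2(t)\lambda_1(t) & -u_2(t)\lambda_2(t) \\ u_1(t)\lambda_1(t) & u_1(t)\lambda_2(t) \end{pmatrix},
\qquad h(0) = 0 .
\]
% u is in L^2, hence in L^1 on [0,\delta], and \lambda_1, \lambda_2 are continuous,
% hence bounded on [0,\delta]; therefore \|A\| belongs to L^1(0,\delta).
\[
|h(t)| \le |h(0)| + \int_0^t \|A(s)\|\,|h(s)|\,ds
\;\Longrightarrow\;
|h(t)| \le |h(0)|\,\exp\!\Bigl(\int_0^t \|A(s)\|\,ds\Bigr) = 0
\qquad \forall t \in [0,\delta].
\]
```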






B.2 Proof of Proposition 4.8

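The fact that $\Sigma_\Delta$ is a closed subset of $M$ is obvious. Let us prove that it is countably
$(n-1)$-rectifiable. Since it suffices to prove the result locally, we can assume that we
have

$$\Delta(x) = \operatorname{Span}\{f_1(x), \ldots, f_{n-1}(x)\} \qquad \forall x \in V,$$

where $V$ is an open neighborhood of the origin in $\mathbb{R}^n$. Moreover, doing a change of
coordinates if necessary, we can also assume that

$$f_i = \frac{\partial}{\partial x_i} + \alpha_i(x)\frac{\partial}{\partial x_n} \qquad \forall i = 1, \ldots, n-1,$$

where each $\alpha_i : V \to \mathbb{R}$ is a $C^\infty$ function satisfying $\alpha_i(0) = 0$. Hence for any
$i, j \in \{1, \ldots, n-1\}$ we have

$$[f_i,f_j] = \left[\Big(\frac{\partial\alpha_j}{\partial x_i} - \frac{\partial\alpha_i}{\partial x_j}\Big) + \Big(\frac{\partial\alpha_j}{\partial x_n}\alpha_i - \frac{\partial\alpha_i}{\partial x_n}\alpha_j\Big)\right]\frac{\partial}{\partial x_n},$$

and so

$$\Sigma_\Delta = \left\{x \in V \;\Big|\; \Big(\frac{\partial\alpha_j}{\partial x_i} - \frac{\partial\alpha_i}{\partial x_j}\Big) + \Big(\frac{\partial\alpha_j}{\partial x_n}\alpha_i - \frac{\partial\alpha_i}{\partial x_n}\alpha_j\Big) = 0 \quad \forall i,j \in \{1,\ldots,n-1\}\right\}.$$

For every tuple $I = (i_1, \ldots, i_k) \in \{1, \ldots, n-1\}^k$ we denote by $f_I$ the $C^\infty$ vector field
constructed by Lie brackets of $f_1, f_2, \ldots, f_{n-1}$ as follows,

$$f_I = [f_{i_1}, [f_{i_2}, \ldots, [f_{i_{k-1}}, f_{i_k}] \ldots ]].$$

We call $k = \operatorname{length}(I)$ the length of the Lie bracket $f_I$. Since $\Delta$ is nonholonomic, there
is some positive integer $r$ such that

$$\mathbb{R}^n = \operatorname{Span}\{f_I(x) \mid \operatorname{length}(I) \le r\} \qquad \forall x \in V.$$

It is easy to see that, for every $I$ such that $\operatorname{length}(I) \ge 2$, there is a $C^\infty$ function
$g_I : V \to \mathbb{R}$ such that

$$f_I(x) = g_I(x)\frac{\partial}{\partial x_n} \qquad \forall x \in V.$$

Defining the sets $A_k$ as

$$A_k := \{x \in V \mid g_I(x) = 0 \ \ \forall I \text{ such that } \operatorname{length}(I) \le k\},$$

we have

$$\Sigma_\Delta = \bigcup_{k=2}^{r} \big(A_k \setminus A_{k+1}\big).$$

We now observe that, thanks to the Implicit Function Theorem, each set $A_k \setminus A_{k+1}$ can
be covered by a countable union of smooth hypersurfaces. Indeed assume that some
given $x$ belongs to $A_k \setminus A_{k+1}$. This implies that there is some $J = (j_1, \ldots, j_{k+1})$ of
length $k+1$ such that $g_J(x) \neq 0$. Set $I = (j_2, \ldots, j_{k+1})$. Since $g_I(x) = 0$, we have

$$g_J(x) = \frac{\partial g_I}{\partial x_{j_1}}(x) + \frac{\partial g_I}{\partial x_n}(x)\,\alpha_{j_1}(x).$$

Hence, either $\frac{\partial g_I}{\partial x_{j_1}}(x) \neq 0$ or $\frac{\partial g_I}{\partial x_n}(x) \neq 0$.

Consequently, we deduce that we have the following inclusion

$$A_k \setminus A_{k+1} \subset \bigcup_{\operatorname{length}(I) = k} \Big\{x \in V \;\Big|\; \exists\, i \in \{1, \ldots, n\} \text{ such that } \frac{\partial g_I}{\partial x_i}(x) \neq 0\Big\}.$$

We conclude easily. Finally, the fact that any Goh path is contained in $\Sigma_\Delta$ is obvious.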
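
As an illustration of the normal form and of the sets $A_k$, here is a standard example chosen for this sketch and not part of the original proof: a Martinet-type distribution on $\mathbb{R}^3$, that is $n = 3$, $\alpha_1 \equiv 0$ and $\alpha_2(x) = x_1^2$.

```latex
% Illustrative example (assumption: Martinet-type distribution on R^3, not from the original text).
\[
f_1 = \frac{\partial}{\partial x_1}, \qquad
f_2 = \frac{\partial}{\partial x_2} + x_1^2\,\frac{\partial}{\partial x_3}.
\]
% Bracket formula with alpha_1 = 0, alpha_2 = x_1^2:
\[
[f_1,f_2] = \Bigl(\frac{\partial \alpha_2}{\partial x_1} - \frac{\partial \alpha_1}{\partial x_2}
 + \frac{\partial \alpha_2}{\partial x_3}\,\alpha_1 - \frac{\partial \alpha_1}{\partial x_3}\,\alpha_2\Bigr)\frac{\partial}{\partial x_3}
 = 2x_1\,\frac{\partial}{\partial x_3},
\qquad
[f_1,[f_1,f_2]] = 2\,\frac{\partial}{\partial x_3}.
\]
% Hence g_{(1,2)}(x) = 2 x_1 and g_{(1,1,2)}(x) = 2, so one can take r = 3 and
\[
\Sigma_\Delta = A_2 = \{x \in \mathbb{R}^3 \mid x_1 = 0\}, \qquad A_3 = \emptyset .
\]
% \Sigma_\Delta is a smooth hypersurface, consistent with \partial g_{(1,2)} / \partial x_1 = 2 \neq 0.
```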